(57) Abstract

A method is provided for identifying a methylated CpG containing nucleic acid
including contacting a nucleic acid with a methylation sensitive restriction
endonuclease that cleaves unmethylated CpG sites, and contacting the sample
with an isoschizomer of the methylation sensitive restriction endonuclease,
wherein the isoschizomer cleaves both methylated and unmethylated CpG sites.
The method also includes adding an oligonucleotide under conditions and for a
time to allow ligation of the oligonucleotide to nucleic acid cleaved by the
restriction endonuclease, and amplifying the nucleic acid. A method is also
provided for detecting an age associated disorder including contacting a
nucleic acid with a methylation sensitive restriction endonuclease that
cleaves unmethylated CpG sites, and contacting the sample with an isoschizomer
of the methylation sensitive restriction endonuclease, wherein the
isoschizomer cleaves both methylated and unmethylated CpG sites. The method
also includes adding an oligonucleotide under conditions and for a time to
allow ligation of the oligonucleotide to nucleic acid cleaved by the
restriction endonuclease, and amplifying the nucleic acid. The amplified
digested nucleic acid is adhered to a membrane, and hybridized with a probe of
interest. A kit useful for detection of a CpG containing nucleic acid is also
provided.

METHYLATED CpG ISLAND AMPLIFICATION (MCA)

STATEMENT AS TO FEDERALLY SPONSORED RESEARCH

This invention was made with Government support under Grant No. CA43318 and
CA54396, awarded by the National Cancer Institute and Grant No. CA43318, a Colon Cancer
Spore Grant. The government may have certain rights in the invention.

FIELD OF THE INVENTION
The present invention relates generally to regulation of gene expression and more 
specifically to a method of determining the DNA methylation status of CpG sites in a given 
locus. 



BACKGROUND OF THE INVENTION 

DNA methylases transfer methyl groups from the universal methyl donor S-adenosyl
methionine to specific sites on the DNA. Several biological functions have been attributed to
the methylated bases in DNA. The most established biological function for methylated DNA
is the protection of DNA from digestion by cognate restriction enzymes. The restriction
modification phenomenon has, so far, been observed only in bacteria. Mammalian cells,
however, possess a different methylase that exclusively methylates cytosine residues that are
5' neighbors of guanine (CpG). This modification of cytosine residues has important
regulatory effects on gene expression, especially when involving CpG rich areas, known as
CpG islands, located in the promoter regions of many genes.

Methylation has been shown by several lines of evidence to play a role
in gene activity, cell differentiation, tumorigenesis, X-chromosome
inactivation, genomic imprinting and other major biological processes
(Razin, A., H., and Riggs, R.D. eds. in DNA Methylation Biochemistry and
Biological Significance, Springer-Verlag, New York, 1984). In eukaryotic
cells, methylation of cytosine residues that are immediately 5' to a guanosine
occurs predominantly in CG poor regions (Bird, A., Nature, 321:209, 1986).
In contrast, CpG islands remain unmethylated in normal cells, except during
X-chromosome inactivation (Migeon, et al., supra) and parental specific
imprinting (Li, et al., Nature, 366:362, 1993) where methylation of 5'
regulatory regions can lead to transcriptional repression. De novo methylation
of the Rb gene has been demonstrated in a small fraction of retinoblastomas
(Sakai, et al., Am. J. Hum. Genet., 48:880, 1991), and recently, a more detailed
analysis of the VHL gene showed aberrant methylation in a subset of sporadic
renal cell carcinomas (Herman, et al., Proc. Natl. Acad. Sci., U.S.A., 91:9700,
1994). Expression of a tumor suppressor gene can also be abolished by de
novo DNA methylation of a normally unmethylated CpG island (Issa, et al.,
Nature Genet., 7:536, 1994; Herman, et al., supra; Merlo, et al., Nature Med.,
1:686, 1995; Herman, et al., Cancer Res., 56:722, 1996; Graff, et al., Cancer
Res., 55:5195, 1995; Herman, et al., Cancer Res., 55:4525, 1995).

Human cancer cells typically contain somatically altered nucleic acid,
characterized by mutation, amplification, or deletion of critical genes. In
addition, the nucleic acid from human cancer cells often displays somatic
changes in DNA methylation (E.R. Fearon, et al., Cell, 61:759, 1990;
P.A. Jones, et al., Cancer Res., 46:461, 1986; R. Holliday, Science, 238:163,
1987; A. De Bustros, et al., Proc. Natl. Acad. Sci., USA, 85:5693, 1988;
P.A. Jones, et al., Adv. Cancer Res., 54:1, 1990; S.B. Baylin, et al., Cancer
Cells, 3:383, 1991; M. Makos, et al., Proc. Natl. Acad. Sci., USA, 89:1929,
1992; N. Ohtani-Fujita, et al., Oncogene, 8:1063, 1993). However, the precise
role of abnormal DNA methylation in human tumorigenesis has not been
established. Aberrant methylation of normally unmethylated CpG islands has
been described as a frequent event in immortalized and transformed cells, and
has been associated with transcriptional inactivation of defined tumor
suppressor genes in human cancers. In the development of colorectal cancers
(CRC), a series of tumor suppressor genes (TSG) such as APC, p53, DCC and
DPC4 are inactivated by mutations and chromosomal deletions (reviewed in
Kinzler and Vogelstein 1996). Some of these alterations result from a
chromosomal instability phenotype described in a subset of CRC (Lengauer et
al., 1997a). Recently, an additional pathway has been shown to be involved in
a familial form of CRC, hereditary non-polyposis colorectal cancer. The
cancers from these patients show a characteristic mutator phenotype which
causes microsatellite instability (MI), and mutations at other gene loci such as
TGF-β-RII (Markowitz et al., 1995) and BAX (Rampino et al., 1997). This
phenotype usually results from mutations in the mismatch repair (MMR)
genes hMSH2 and hMLH1 (reviewed by Peltomaki and de la Chapelle, 1997).
A subset of sporadic CRC also show MI, but mutations in MMR genes appear
to be less frequent in these tumors (Liu et al., 1995; Moslein et al., 1996).

Another molecular defect described in CRC is CpG island (CGI)
methylation. CGIs are short sequences rich in the CpG dinucleotide and can
be found in the 5' region of about half of all human genes (Bird, 1986).
Methylation of cytosine within 5' CGIs is associated with loss of gene
expression and has been seen in physiological conditions such as X
chromosome inactivation and genomic imprinting (reviewed in Latham,
1996). Aberrant methylation of CGIs has been detected in genetic diseases
such as the fragile-X syndrome (Hansen et al., 1992), in aging cells (Issa et
al., 1994) and in neoplasia. About half of the tumor suppressor genes which
have been shown to be mutated in the germline of patients with familial cancer
syndromes have also been shown to be aberrantly methylated in some
proportion of sporadic cancers, including Rb, VHL, p16, hMLH1, and BRCA1
(reviewed in Baylin et al., 1998; Jones 1997). TSG methylation in cancer is
usually associated with (1) lack of gene transcription and (2) absence of
coding region mutation. Thus it has been proposed that CGI methylation
serves as an alternative mechanism of gene inactivation in cancer.

The causes and global patterns of CGI methylation in human cancers
remain poorly defined. Aging could be a factor in this process because
methylation of several CGIs could be detected in an age-related manner in
normal colon mucosa as well as in CRC (Issa et al., 1994). In addition,
aberrant methylation of CGIs has been associated with the MI phenotype in
CRC (Ahuja et al., 1997) as well as specific carcinogen exposures (Issa et al.,
1996). However, an understanding of aberrant methylation in CRC has been
somewhat limited by the small number of CGIs analyzed to date. In fact,
previous studies have suggested that large numbers of CGIs are methylated in
immortalized cell lines (Antequera et al., 1990), and it is not well understood
whether this global aberrant methylation is caused by the cell culture
conditions or whether it is an integral part of the pathogenesis of cancer.

Most of the methods developed to date for detection of methylated
cytosine depend upon cleavage of the phosphodiester bond alongside cytosine
residues, using either methylation-sensitive restriction enzymes or reactive
chemicals such as hydrazine which differentiate between cytosine and its
5-methyl derivative. Genomic sequencing protocols which identify a 5-MeC
residue in genomic DNA as a site that is not cleaved by any of the
Maxam-Gilbert sequencing reactions have also been used, but still suffer
disadvantages such as the requirement for large amounts of genomic DNA and
the difficulty in detecting a gap in a sequencing ladder which may contain
bands of varying intensity.



Mapping of methylated regions in DNA has relied primarily on
Southern hybridization approaches, based on the inability of
methylation-sensitive restriction enzymes to cleave sequences which contain
one or more methylated CpG sites. This method provides an assessment of the
overall methylation status of CpG islands, including some quantitative
analysis, but is relatively insensitive and requires large amounts of high
molecular weight DNA.



Another method utilizes bisulfite treatment of DNA to convert all
unmethylated cytosines to uracil. The altered DNA is amplified and
sequenced to show the methylation status of all CpG sites. However, this
method is technically difficult and labor intensive and, without cloning of
the amplified products, it is less sensitive than Southern analysis, requiring
approximately 10% of the alleles to be methylated for detection.
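
For illustration only, the effect of bisulfite treatment on a sequence read
can be sketched in Python as follows. The function name, the example sequence
and the set of methylated positions are hypothetical and are not taken from
any cited protocol; the sketch assumes complete conversion and considers a
single strand.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as read after bisulfite treatment and PCR.

    Unmethylated C is converted to uracil, which is read as T after
    amplification; methylated C (positions given as 0-based indices)
    is protected and is still read as C.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical example: only the C at index 5 (part of a CpG) is methylated.
example = "ACGTCCGGA"
print(bisulfite_convert(example, {5}))  # ATGTTCGGA
```

Comparing the converted read with the untreated sequence shows which
cytosines were protected, which is the information that sequencing of the
amplified product is intended to reveal.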

Identification of the earliest genetic changes in tumorigenesis is a
major focus in molecular cancer research. Diagnostic approaches based on
identification of these changes are likely to allow implementation of early
detection strategies, and novel therapeutic approaches targeting these early
changes might lead to more effective cancer treatment.

SUMMARY OF THE INVENTION

The invention provides a method for detecting a methylated
CpG-containing nucleic acid. This method can be used to identify sequences
which are differentially methylated during a disease process such as a cell
proliferative disorder.

In one embodiment, a method is provided for identifying a methylated
CpG-containing nucleic acid. The method includes contacting a nucleic acid
sample suspected of containing a CpG-containing nucleic acid with a
methylation sensitive restriction endonuclease that cleaves only unmethylated
CpG sites, under conditions and for a time to allow cleavage of unmethylated
nucleic acid; and contacting the sample with an isoschizomer of the
methylation sensitive restriction endonuclease, wherein the isoschizomer of
the methylation sensitive restriction endonuclease cleaves both methylated and
unmethylated CpG sites. Oligonucleotides are added to the nucleic acid
sample under conditions and for a time to allow ligation of the
oligonucleotides to nucleic acid cleaved by the restriction endonuclease and
the digested nucleic acid is amplified for further analysis.

In another embodiment, a method is provided for detecting an
age-associated disorder associated with methylation of CpG islands in a
nucleic acid sequence of interest in a subject having or at risk of having
said disorder. The method includes contacting a nucleic acid sample suspected
of comprising a CpG-containing nucleic acid with a methylation sensitive
restriction endonuclease that cleaves only unmethylated CpG sites under
conditions and for a time to allow cleavage of unmethylated nucleic acid, and
contacting the sample with an isoschizomer of the methylation sensitive
restriction endonuclease, wherein the isoschizomer of the methylation sensitive
restriction endonuclease cleaves both methylated and unmethylated CpG sites.
Oligonucleotides are added to the nucleic acid sample under conditions and
for a time to allow ligation of the oligonucleotides to nucleic acid cleaved by
the restriction endonuclease, and the digested nucleic acid is amplified. The
amplified, digested nucleic acid is contacted with a membrane and the
membrane is hybridized with a probe of interest.

In yet another embodiment, a method is provided for evaluating the
response of a cell to an agent. The method includes contacting a nucleic acid
sample suspected of containing a CpG-containing nucleic acid with a
methylation sensitive restriction endonuclease that cleaves only unmethylated
CpG sites, under conditions and for a time to allow cleavage of unmethylated
nucleic acid, and contacting the sample with an isoschizomer of the
methylation sensitive restriction endonuclease, wherein the isoschizomer of
the methylation sensitive restriction endonuclease cleaves both methylated and
unmethylated CpG sites. Oligonucleotides are added to the nucleic acid
sample under conditions and for a time to allow ligation of the
oligonucleotides to nucleic acid cleaved by the restriction endonuclease, and
the digested nucleic acid is amplified. The amplified, digested nucleic acid is
adhered to a membrane and the membrane is hybridized with a probe of
interest.

In a further embodiment, a kit for the detection of a methylated
CpG-containing nucleic acid is provided. In one embodiment the kit includes a
carrier means containing one or more containers including a container
containing an oligonucleotide for ligation of the oligonucleotides to nucleic
acid, a second container containing a methylation sensitive restriction
endonuclease and a third container containing an isoschizomer of the
methylation sensitive endonuclease. In another embodiment the kit includes a
carrier means containing one or more containers containing a membrane,
wherein the membrane has a member of the group consisting of SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID
NO:33 (MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10,
MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23,
MINT24, MINT27, MINT30, MINT31, MINT32, and MINT33) immobilized
on the membrane.

In a further embodiment, an isolated nucleic acid including a member
selected from SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:14, SEQ ID
NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:22,
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:27, SEQ ID NO:30, SEQ ID
NO:31, SEQ ID NO:32, and SEQ ID NO:33 (MINT1, MINT2, MINT4,
MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19,
MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32, and MINT33) is provided. An isolated methylated nucleic acid
sequence having a sequence as set forth in a member of the group consisting
of SEQ ID NOs: 1-33 (MINT1-33) is also provided.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of MCA. A hypothetical fragment of
genomic DNA is represented by a solid line, with 7 SmaI sites depicted by tick
marks. Methylated SmaI sites are indicated by an m. Fragments B and D are
CpG islands. B is methylated in both normal (right) and cancer (left), while D
is differentially methylated in cancer. For MCA, unmethylated SmaI sites are
eliminated by digestion with SmaI (which is methylation-sensitive and does
not cleave when its recognition sequence CCCGGG contains a methylated
CpG), which leaves the fragment blunt ended. Methylated SmaI sites are then
digested with the non-methylation sensitive SmaI isoschizomer XmaI, which
digests methylated CCCGGG sites, leaving a CCGG overhang (sticky ends).
Adaptors are ligated to these sticky ends, and PCR is performed to amplify the
methylated sequences. The MCA amplicons can be used directly in a dot blot
analysis to study the methylation status of any gene for which a probe is
available (left). Alternatively, MCA products can be used to clone
differentially methylated sequences by RDA (right).

FIG. 2 shows the nucleotide sequence of a differentially methylated
clone, MINT2, obtained by MCA followed by RDA. The restriction
endonuclease sites for SmaI are underlined. Primer sequences used for
bisulfite-PCR are also underlined. The restriction endonuclease site for BstUI
used to detect methylation after bisulfite-PCR is shown by a gray box.

FIG. 3 shows a map of the versican gene first exon (filled box) and
flanking regions. The position of MINT11 is shown by a solid line (on top).
CpG sites are indicated below. Locations of the primers used for bisulfite-PCR
are shown by arrows.

FIG. 4 is a pictorial representation of global hypermethylation in CRC.
Each column represents a separate gene locus. Each row is a primary
colorectal cancer (samples above the bold solid line) or polyp (below the bold
solid line). Black squares: methylation > 10%. Gray squares: 1-10%
methylation. White squares: < 1% methylation. A: GH+MI+, B: GH+MI-, C:
GH-MI+, D: GH-MI-, E: GH+, F: GH-. A-D are cancers. E and F are
adenomas. MI denotes the presence of microsatellite instability. ND, not done.

FIG. 5 shows a model integrating CGI methylation in colorectal
carcinogenesis.

FIGS. 6A-H are the nucleic acid sequences of MINT1-33 (SEQ ID
NO: 1-33).

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention provides a method for identifying a methylated
CpG-containing nucleic acid called methylated CpG island amplification
(MCA). MCA can be used to study methylation in normal and neoplastic
cells, and allows rapid screening of nucleic acid samples for the presence of
hypermethylation of specific genes. MCA can also be used to clone genes and
nucleic acid sequences differentially methylated in normal and abnormal
tissues and cells.

It should be noted that as used herein and in the appended claims, the
singular forms "a," "an," and "the" include plural referents unless the context
clearly dictates otherwise. Thus, for example, reference to "a cell" includes a
plurality of such cells and reference to "the restriction enzyme" includes
reference to one or more restriction enzymes and equivalents thereof known to
those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as commonly understood by one of ordinary skill in the
art to which this invention belongs. Although any methods, devices and
materials similar or equivalent to those described herein can be used in the
practice or testing of the invention, the preferred methods, devices and
materials are now described.

All publications mentioned herein are incorporated herein by reference
in full for the purpose of describing and disclosing the methodologies which
are described in the publications which might be used in connection with the
presently described invention. The publications discussed above and
throughout the text are provided solely for their disclosure prior to the filing
date of the present application. Nothing herein is to be construed as an
admission that the inventors are not entitled to antedate such disclosure by
virtue of prior invention.

Any nucleic acid sample, in purified or nonpurified form, can be
utilized as the starting nucleic acid or acids, provided it contains, or is
suspected of containing, a nucleic acid sequence containing the target locus
(e.g., CpG-containing nucleic acid). In general the CpG-containing nucleic
acid will be DNA. However, the process may employ, for example, samples
that contain DNA, or DNA and RNA, including messenger RNA, wherein
DNA or RNA may be single stranded or double stranded, or a DNA-RNA
hybrid may be included in the sample. A mixture of nucleic acids may also be
employed. The specific nucleic acid sequence to be detected may be a fraction
of a larger molecule or can be present initially as a discrete molecule, so that
the specific sequence constitutes the entire nucleic acid. It is not necessary
that the sequence to be studied be present initially in a pure form; the nucleic
acid may be a minor fraction of a complex mixture, such as contained in
whole human DNA. The nucleic acid may be contained in a biological
sample. Such samples include but are not limited to a serum, urine, saliva,
cerebrospinal fluid, pleural fluid, ascites fluid, sputum, stool, or biopsy
sample. The nucleic acid-containing sample used for detection of methylated
CpG may be from any source including, but not limited to, brain, colon,
urogenital, hematopoietic, thymus, testis, ovarian, uterine, prostate, breast,
lung and renal tissue and may be extracted by a variety of techniques
such as that described by Maniatis, et al. (Molecular Cloning: a Laboratory
Manual, Cold Spring Harbor, NY, pp 280, 281, 1982).

The nucleic acid of interest can be any nucleic acid where it is
desirable to detect the presence of a CpG island. In one embodiment, the CpG
island comprises a CpG island located in a gene. A "CpG island" is a CpG
rich region of a nucleic acid sequence. The nucleic acid sequence may be, for
example, a p16, a Rb, a VHL, a hMLH1, or a BRCA1 gene. Alternatively the
nucleic acid of interest can be, for example, a MINT1-33 nucleic acid
sequence. However, any gene or nucleic acid sequence of interest containing
a CpG sequence can be detected using the method of the invention.
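
As a non-limiting illustration of what "CpG rich" can mean in practice, the
following Python sketch computes two statistics often used to describe a
candidate region: the G+C fraction and the ratio of observed to expected CpG
dinucleotides. The function name and the example sequences are hypothetical,
and no particular threshold is prescribed by this description; any cut-off
applied to these values is an assumption of the user.

```python
def cpg_stats(seq):
    """Return (gc_fraction, cpg_observed_over_expected) for a DNA sequence.

    Expected CpG count is (number of C * number of G) / length, i.e. the
    count expected if C and G were distributed independently.
    """
    s = seq.upper()
    length = len(s)
    if length == 0:
        return 0.0, 0.0
    c_count = s.count("C")
    g_count = s.count("G")
    cpg_count = s.count("CG")  # "CG" cannot overlap itself
    gc_fraction = (c_count + g_count) / length
    expected = c_count * g_count / length
    ratio = cpg_count / expected if expected else 0.0
    return gc_fraction, ratio


# Hypothetical sequences: one CpG dense, one without any C or G.
print(cpg_stats("CGCGCGCG"))  # (1.0, 2.0)
print(cpg_stats("ATATATAT"))  # (0.0, 0.0)
```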

The presence of methylated CpG in the nucleic acid-containing
specimen may be indicative of a disorder. In one embodiment, the disorder is
a cell proliferative disorder. A "cell proliferative disorder" is any disorder in
which the proliferative capabilities of the affected cells are different from the
normal proliferative capabilities of unaffected cells. An example of a cell
proliferative disorder is neoplasia. Malignant cells (i.e., cancer) develop as a
result of a multistep process. Specific, non-limiting examples of disorders
associated with increased methylation of CpG-islands are colon cancer, lung
cancer, renal cancer, leukemia, breast cancer, prostate cancer, uterine cancer,
astrocytoma, glioblastoma, and neuroblastoma.

In another embodiment, the disorder is an age-associated disorder. The
term "age-associated disorder" is used to describe a disorder observed with the
biological progression of events occurring over time in a subject. Preferably,
the subject is a human. Non-limiting examples of age-associated disorders
include atherosclerosis, diabetes mellitus, and dementia.
An age-associated disorder may also be a cell proliferative disorder.
Examples of age-associated disorders which are cell proliferative disorders
include colon cancer, lung cancer, breast cancer, prostate cancer, and
melanoma, amongst others. An age-associated disorder is further intended to
mean the biological progression of events that occur during a disease process
that affects the body, which mimic or substantially mimic all or part of the
aging events which occur in a normal subject, but which occur in the diseased
state over a shorter period of time.

In one embodiment, the age-associated disorder is a "memory
disorder or learning disorder," which is characterized by a statistically
significant decrease in memory or learning assessed over time by the Randt
Memory Test (Randt et al., Clin. Neuropsychol., 2:184, 1980), Wechsler
Memory Scale (J. Psych., 19:87-95, 1945), Forward Digit Span test (Craik,
Age Differences in Human Memory, in: Handbook of the Psychology of
Aging, Birren, J., and Schaie, K., Eds., New York, Van Nostrand, 1977),
Mini-Mental State Exam (Folstein et al., J. of Psych. Res. 12:189-192, 1975),
or California Verbal Learning Test (CVLT), wherein such
non-neurodegenerative pathological factors as aging, anxiety, fatigue, anger,
depression, confusion, or vigor are controlled for (see, for example, U.S.
Patent No. 5,063,206).

If the sample is impure (e.g., plasma, serum, stool, ejaculate, sputum,
saliva, cerebrospinal fluid or blood or a sample embedded in paraffin), it may
be treated before amplification with a reagent effective for opening the cells,
fluids, tissues, or animal cell membranes of the sample, and for exposing the
nucleic acid(s). Methods for purifying or partially purifying nucleic acid from
a sample are well known in the art (e.g., Sambrook et al., Molecular Cloning:
a Laboratory Manual, Cold Spring Harbor Press, 1989, herein incorporated by
reference).

In one embodiment, a method is provided for identifying a methylated
CpG-containing nucleic acid, including contacting a nucleic acid sample
suspected of comprising a CpG-containing nucleic acid with a methylation
sensitive restriction endonuclease that cleaves only unmethylated CpG sites
under conditions and for a time to allow cleavage of unmethylated nucleic
acid. The sample is further contacted with an isoschizomer of the methylation
sensitive restriction endonuclease, that cleaves both methylated and
unmethylated CpG-sites, under conditions and for a time to allow cleavage of
methylated nucleic acid. Oligonucleotides are added to the nucleic acid
sample under conditions and for a time to allow ligation of the
oligonucleotides to nucleic acid cleaved by the restriction endonuclease, and
the digested nucleic acid is amplified. Following identification, the
methylated CpG-containing nucleic acid can be cloned, using methods well
known to one of skill in the art (see Sambrook et al., Molecular Cloning: a
Laboratory Manual, Cold Spring Harbor Press, 1989).

A "methylation sensitive restriction endonuclease" is a restriction
endonuclease that includes CG as part of its recognition site and has altered
activity when the C is methylated as compared to when the C is not
methylated. Preferably, the methylation sensitive restriction endonuclease has
inhibited activity when the C is methylated (e.g., SmaI). Specific non-limiting
examples of methylation sensitive restriction endonucleases include SmaI,
BssHII, or HpaII. Such enzymes can be used alone or in combination. Other
methylation sensitive restriction endonucleases will be known to those of skill
in the art and include, but are not limited to, SacII, EagI, and BstUI, for
example. An "isoschizomer" of a methylation sensitive restriction
endonuclease is a restriction endonuclease which recognizes the same
recognition site as a methylation sensitive restriction endonuclease but which
cleaves both methylated and unmethylated CGs. One of skill in the art can
readily determine appropriate conditions for a restriction endonuclease to
cleave a nucleic acid (see Sambrook et al., Molecular Cloning: a Laboratory
Manual, Cold Spring Harbor Press, 1989). Without being bound by theory,
actively transcribed genes generally contain fewer methylated CGs than
other genes.
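
For illustration only, candidate cleavage sites for the enzymes named above
can be located in a sequence with a short Python sketch. The recognition
sequences in the table are supplied from general knowledge of these enzymes
(only the SmaI site, CCCGGG, is stated in this description) and should be
checked against the supplier's documentation; the function name and example
sequences are hypothetical.

```python
import re

# Recognition sequences, written 5' to 3'. Each contains at least one CG.
# Assumption: values come from general enzyme references, not from this text.
RECOGNITION_SITES = {
    "SmaI": "CCCGGG",
    "BssHII": "GCGCGC",
    "HpaII": "CCGG",
    "SacII": "CCGCGG",
    "EagI": "CGGCCG",
    "BstUI": "CGCG",
}


def find_sites(seq, enzyme):
    """Return 0-based start indices of every recognition site, overlaps included."""
    site = RECOGNITION_SITES[enzyme]
    # A lookahead matches without consuming, so overlapping sites are found.
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


print(find_sites("TTCCCGGGAA", "SmaI"))   # [2]
print(find_sites("TTCCCGGGAA", "HpaII"))  # [3]
print(find_sites("ACGCGCGT", "BstUI"))    # [1, 3]
```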

In the method of the invention, a nucleic acid of interest is cleaved
with a methylation sensitive endonuclease. In one embodiment, cleavage with
the methylation sensitive endonuclease creates a sufficient overhang on the
nucleic acid of interest. Following cleavage with the isoschizomer, the
cleavage product can still have a sufficient overhang. An "overhang" refers to
nucleic acid having two strands wherein the strands end in such a manner that



WO 00/26401

PCT/US99/25251






a few bases of one strand are not base paired to the other strand. A "sufficient 
overhang" refers to an overhang of sufficient length to allow specific 
hybridization of an oligonucleotide of interest. In one embodiment, a 
sufficient overhang is at least two bases in length. In another embodiment, the 
sufficient overhang is four or more bases in length. An overhang of a specific 
sequence on the nucleic acid of interest may be desired in order for an 
oligonucleotide of interest to hybridize. In this case, the isoschizomer can be 
used to create the overhang having the desired sequence on the nucleic acid of 
interest. 



In another embodiment, the cleavage with a methylation sensitive
endonuclease results in a reaction product of the nucleic acid of interest that
has a blunt end or an insufficient overhang. In this embodiment, an
isoschizomer of the methylation sensitive restriction endonuclease can create a
sufficient overhang on the nucleic acid of interest. "Blunt ends" refers to a
flush ending of two strands, the sense strand and the antisense strand, of a
nucleic acid.

Once a sufficient overhang is created on the nucleic acid of interest, an 
oligonucleotide is ligated to the nucleic acid of interest which has
been cleaved by the methylation specific restriction endonuclease. "Ligation"
is the attachment of two nucleic acid sequences by base pairing of
substantially complementary sequences or by the formation of covalent
bonds between two nucleic acid sequences. An "oligonucleotide" is a nucleic
acid sequence of 2 to 40 bases in length. Preferably the oligonucleotide is
from 15 to 35 bases in length. In one embodiment, the oligonucleotide is
ligated to the overhang on the nucleic acid sequence of interest by base
pairing.

In one embodiment, two oligonucleotides are utilized to form an
adaptor. An "adaptor" is a double-stranded nucleic acid sequence that has
a sufficient single-stranded overhang at one or both ends such that
the adaptor can be ligated by base-pairing to a sufficient overhang on a nucleic
acid of interest that has been cleaved by a methylation sensitive restriction
enzyme or an isoschizomer of a methylation sensitive restriction enzyme. In
one embodiment, two oligonucleotides can be used to form an adaptor; these
oligonucleotides are substantially complementary over their entire sequence
except for the region(s) at the 5' and/or 3' ends that will form a single stranded
overhang. The single stranded overhang is complementary to an overhang on
the nucleic acid cleaved by a methylation sensitive restriction enzyme or an
isoschizomer of a methylation sensitive restriction enzyme, such that the
overhang on the nucleic acid of interest will base pair with the 3' or 5' single
stranded end of the adaptor under appropriate conditions. The conditions will
vary depending on the sequence composition (GC vs AT), the length, and the
type of nucleic acid (see Sambrook et al., Molecular Cloning: a Laboratory
Manual, 2nd Ed.; Cold Spring Harbor Laboratory Press, Plainview, NY,
1998).
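
The base-pairing requirement between an adaptor overhang and the overhang on a cleaved nucleic acid can be illustrated with a minimal Python sketch. The function names and the 4-base overhang sequences below are hypothetical and chosen only for illustration; the sketch checks exact Watson-Crick complementarity and ignores ligation conditions.

```python
# Watson-Crick pairing partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def overhangs_can_pair(fragment_overhang: str, adaptor_overhang: str) -> bool:
    """True if two single-stranded overhangs (each written 5'->3') are
    exactly complementary and could therefore base pair."""
    return adaptor_overhang.upper() == reverse_complement(fragment_overhang)


# Hypothetical overhangs: a palindromic overhang pairs with an identical one,
# while an unrelated overhang does not.
print(overhangs_can_pair("CCGG", "CCGG"))
print(overhangs_can_pair("CCGG", "AATT"))
```

A blunt end has no overhang at all, so in this toy model it would have nothing to pair with, which is consistent with the need for an isoschizomer that leaves a sufficient overhang.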


Following the ligation of the oligonucleotide, the nucleic acid of 
interest is amplified using a primer complementary to the oligonucleotide. 
Specifically, the term "primer" as used herein refers to a sequence comprising 
two or more deoxyribonucleotides or ribonucleotides, preferably more than 
three, and most preferably more than 8, which sequence is capable of initiating
synthesis of a primer extension product, which is substantially complementary
to a nucleic acid such as an adaptor or a ligated oligonucleotide.
Environmental conditions conducive to synthesis include the presence of
nucleoside triphosphates and an agent for polymerization, such as DNA
polymerase, and a suitable temperature and pH. The primer is preferably
single stranded for maximum efficiency in amplification, but may be double
stranded. If double stranded, the primer is first treated to separate its strands
before being used to prepare extension products. In one embodiment, the
primer is an oligodeoxyribonucleotide. The primer must be sufficiently long
to prime the synthesis of extension products in the presence of the inducing
agent for polymerization. The exact length of primer will depend on many
factors, including temperature, buffer, and nucleotide composition. The
oligonucleotide primer typically contains 12-20 or more nucleotides, although
it may contain fewer nucleotides.

Primers of the invention are designed to be "substantially" 
complementary to each strand of the oligonucleotide to be amplified and
include the appropriate G or C nucleotides as discussed above. This means
that the primers must be sufficiently complementary to hybridize with their
respective strands under conditions which allow the agent for polymerization
to perform. In other words, the primers should have sufficient
complementarity with a 5' and 3' oligonucleotide to hybridize therewith and
permit amplification of CpG containing nucleic acid sequence.

Primers of the invention are employed in the amplification process 
which is an enzymatic chain reaction that produces exponential quantities of 
target locus relative to the number of reaction steps involved (e.g., polymerase
chain reaction or PCR). Typically, one primer is complementary to the
negative (-) strand of the locus and the other is complementary to the positive
(+) strand. Annealing the primers to denatured nucleic acid followed by
extension with an enzyme, such as the large fragment of DNA Polymerase I
(Klenow) and nucleotides, results in newly synthesized + and - strands
containing the target locus sequence. Because these newly synthesized
sequences are also templates, repeated cycles of denaturing, primer annealing,
and extension result in exponential production of the region (i.e., the target
locus sequence) defined by the primer. The product of the chain reaction is a
discrete nucleic acid duplex with termini corresponding to the ends of the
specific primers employed.
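
The exponential production described above can be made concrete with a small Python sketch. It assumes an idealized reaction in which every cycle multiplies the number of target copies by (1 + efficiency); the function name and the efficiency parameter are illustrative assumptions, and real reactions plateau as reagents are consumed.

```python
def ideal_copies(initial_templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Copies of the target locus after `cycles` rounds of denaturing,
    annealing, and extension, assuming each cycle multiplies the copy
    number by (1 + efficiency). efficiency=1.0 means perfect doubling."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_templates * (1.0 + efficiency) ** cycles


# One starting template, perfect doubling, 10 and 30 cycles.
print(ideal_copies(1, 10))
print(ideal_copies(1, 30))
```

With perfect doubling the copy number after n cycles is the starting number times 2 to the power n, which is why a modest number of cycles yields a very large quantity of the region defined by the primers.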

The oligonucleotide primers of the invention may be prepared using 
any suitable method, such as conventional phosphotriester and phosphodiester 
methods or automated embodiments thereof. In one such automated 
embodiment, diethylphosphoramidites are used as starting materials and may 
be synthesized as described by Beaucage, et al. (Tetrahedron Letters,
22:1859-1862, 1981). One method for synthesizing oligonucleotides on a
modified solid support is described in U.S. Patent No. 4,458,066. 

Where the CpG-containing nucleic acid sequence of interest contains 
two strands, it is necessary to separate the strands of the nucleic acid before it 
can be used as a template for the amplification process. Strand separation can 
be effected either as a separate step or simultaneously with the synthesis of the
primer extension products. This strand separation can be accomplished using
various suitable denaturing conditions, including physical, chemical, or
enzymatic means; the word "denaturing" includes all such means. One
physical method of separating nucleic acid strands involves heating the nucleic
acid until it is denatured. Typical heat denaturation may involve temperatures
ranging from about 80° to 105°C for times ranging from about 1 to 10 minutes.
Strand separation may also be induced by an enzyme from the class of
enzymes known as helicases or by the enzyme RecA, which has helicase
activity, and in the presence of riboATP, is known to denature DNA. The
reaction conditions suitable for strand separation of nucleic acids with
helicases are described by Kuhn Hoffmann-Berling (CSH-Quantitative
Biology, 42:63, 1978) and techniques for using RecA are reviewed in
C. Radding (Ann. Rev. Genetics, 16:405-437, 1982).

When complementary strands of nucleic acid or acids are separated, 
regardless of whether the nucleic acid was originally double or single 
stranded, the separated strands are ready to be used as a template for the
synthesis of additional nucleic acid strands. This synthesis is performed under
conditions allowing hybridization of primers to templates to occur. Generally
synthesis occurs in a buffered aqueous solution, generally at a pH of about 7-9.
Preferably, a molar excess (for genomic nucleic acid, usually about 10*:1
primer:template) of the two oligonucleotide primers is added to the buffer
containing the separated template strands. It is understood, however, that the
amount of complementary strand may not be known if the process of the
invention is used for diagnostic applications, so that the amount of primer
relative to the amount of complementary strand cannot be determined with
certainty. As a practical matter, however, the amount of primer added will
generally be in molar excess over the amount of complementary strand
(template) when the sequence to be amplified is contained in a mixture of
complicated long-chain nucleic acid strands. A large molar excess is preferred
to improve the efficiency of the process.
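
When the amount of template is known, the primer:template molar ratio can be estimated as in the following Python sketch. The helper names are hypothetical, and the figure of about 650 g/mol per base pair is a common approximation for double-stranded DNA that is assumed here rather than taken from the text.

```python
# Assumed average molar mass of one base pair of double-stranded DNA (g/mol).
AVG_BP_MASS_G_PER_MOL = 650.0


def template_pmol(mass_ng: float, length_bp: int) -> float:
    """Approximate picomoles of a double-stranded template from its mass
    in nanograms and its length in base pairs."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return moles * 1e12


def primer_to_template_ratio(primer_pmol: float, template_mass_ng: float,
                             template_length_bp: int) -> float:
    """Molar ratio of primer to template; values much greater than 1
    correspond to the molar excess of primer described in the text."""
    return primer_pmol / template_pmol(template_mass_ng, template_length_bp)


# Hypothetical reaction: 50 pmol of primer with 650 ng of a 1000 bp template.
print(primer_to_template_ratio(50.0, 650.0, 1000))
```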

The deoxyribonucleoside triphosphates dATP, dCTP, dGTP, and dTTP
are added to the synthesis mixture, either separately or together with the
primers, in adequate amounts and the resulting solution is heated to about 90°-
100°C for about 1 to 10 minutes, preferably from 1 to 4 minutes. After this
heating period, the solution is allowed to cool to approximately room
temperature, which is preferable for the primer hybridization. To the cooled
mixture is added an appropriate agent for effecting the primer extension
reaction (called herein "agent for polymerization"), and the reaction is allowed
to occur under conditions known in the art. The agent for polymerization may
also be added together with the other reagents if it is heat stable. This
synthesis (or amplification) reaction may occur at room temperature up to a
temperature above which the agent for polymerization no longer functions.
Thus, for example, if DNA polymerase is used as the agent, the temperature is
generally no greater than about 40°C. Most conveniently the reaction occurs
at room temperature.

The agent for polymerization may be any compound or system which
will function to accomplish the synthesis of primer extension products,
including enzymes. Suitable enzymes for this purpose include, for example,
E. coli DNA polymerase I, Klenow fragment of E. coli DNA polymerase I, T4
DNA polymerase, other available DNA polymerases, polymerase muteins,
reverse transcriptase, and other enzymes, including heat-stable enzymes (i.e.,
those enzymes which perform primer extension after being subjected to
temperatures sufficiently elevated to cause denaturation). Suitable enzymes
will facilitate combination of the nucleotides in the proper manner to form the
primer extension products which are complementary to each locus nucleic acid
strand. Generally, the synthesis will be initiated at the 3' end of each primer
and proceed in the 5' direction along the template strand, until synthesis
terminates, producing molecules of different lengths. There may be agents for
polymerization, however, which initiate synthesis at the 5' end and proceed in
the other direction, using the same process as described above.

Preferably, the method of amplifying is by PCR, as described herein
and as is commonly used by those of ordinary skill in the art. However,
alternative methods of amplification have been described and can also be
employed.


Once amplified, the nucleic acid can be attached to a solid support, 
such as a membrane, and can be hybridized with any probe of interest, to 
detect any nucleic acid sequence. Several membranes are known to one of 
skill in the art for the adhesion of nucleic acid sequences. Specific
non-limiting examples of these membranes include nitrocellulose (Nitropure) or
other membranes used for detection of gene expression such as
polyvinylchloride, diazotized paper and other commercially available
membranes such as Genescreen™, Zetaprobe™ (Biorad), and Nytran™.
Methods for attaching nucleic acids to these membranes are well known to one
of skill in the art. Alternatively, screening can be done in a liquid phase.

In nucleic acid hybridization reactions, the conditions used to achieve a 
particular level of stringency will vary, depending on the nature of the nucleic 
acids being hybridized. For example, the length, degree of complementarity, 
nucleotide sequence composition (e.g., GC v. AT content), and nucleic acid
type (e.g., RNA v. DNA) of the hybridizing regions of the nucleic acids can be
considered in selecting hybridization conditions. An additional consideration 
is whether one of the nucleic acids is immobilized, for example, on a filter. 

An example of progressively higher stringency conditions is as
follows: 2 x SSC/0.1% SDS at about room temperature (hybridization
conditions); 0.2 x SSC/0.1% SDS at about room temperature (low stringency
conditions); 0.2 x SSC/0.1% SDS at about 42°C (moderate stringency
conditions); and 0.1 x SSC at about 68°C (high stringency conditions).
Washing can be carried out using only one of these conditions, e.g., high
stringency conditions, or each of the conditions can be used, e.g., for 10-15
minutes each, in the order listed above, repeating any or all of the steps listed.
However, as mentioned above, optimal conditions will vary, depending on the
particular hybridization reaction involved, and can be determined empirically.
In general, conditions of high stringency are used for the hybridization of the
probe of interest.
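
The wash series listed above can be written down as a small data structure, as in the following Python sketch. The class and function names are hypothetical; the values are those stated in the text, with no SDS recorded for the high stringency wash because none is given.

```python
from typing import List, NamedTuple, Optional


class Wash(NamedTuple):
    label: str
    ssc_x: float                  # SSC concentration, in multiples of 1 x SSC
    sds_percent: Optional[float]  # None where the text gives no SDS value
    temperature: str


# Conditions in order of increasing stringency, as listed in the text.
WASH_SERIES: List[Wash] = [
    Wash("hybridization", 2.0, 0.1, "about room temperature"),
    Wash("low stringency", 0.2, 0.1, "about room temperature"),
    Wash("moderate stringency", 0.2, 0.1, "about 42 C"),
    Wash("high stringency", 0.1, None, "about 68 C"),
]


def washes_up_to(level: str) -> List[Wash]:
    """Return every wash from the first condition through `level`,
    in the order listed, for a protocol that uses each step in turn."""
    labels = [wash.label for wash in WASH_SERIES]
    return WASH_SERIES[: labels.index(level) + 1]


for wash in washes_up_to("moderate stringency"):
    print(wash)
```

Lower salt and higher temperature both raise stringency, which is why the SSC concentration never increases along the series.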

The probe of interest can be detectably labeled, for example, with a
radioisotope, a fluorescent compound, a bioluminescent compound, a
chemiluminescent compound, a metal chelator, or an enzyme. Those of 
ordinary skill in the art will know of other suitable labels for binding to the 
probe, or will be able to ascertain such, using routine experimentation. 


In one embodiment, representational difference analysis (RDA, see 
Lisitsyn et al., Science 252:946-951, 1993, herein incorporated by reference)
can be performed on CpG-containing nucleic acid following MCA. RDA
utilizes kinetic and subtractive enrichment to purify restriction endonuclease
fragments present in one population of nucleic acid fragments but not in
another. Thus, RDA enables the identification of small differences between
the sequences of two nucleic acid populations. RDA uses nucleic acid from
one population as a "tester" and nucleic acid from a second population as a
"driver," in order to clone probes for single copy sequences present in (or
absent from) one of the two populations. In one embodiment, nucleic acid
from a "normal" individual or sample, not having a disorder such as a cell
proliferative disorder, is used as a "driver," and nucleic acid from an "affected"
individual or sample, having the disorder such as a cell proliferative disorder,
is used as a "tester." In one embodiment, the nucleic acid used as a "tester" is
isolated from an individual having a cell proliferative disorder such as colon
cancer, lung cancer, renal cancer, leukemia, breast cancer, prostate cancer,
uterine cancer, astrocytoma, glioblastoma, and neuroblastoma. The nucleic
acid used as a "driver" is thus normal colon, normal lung, normal kidney,
normal blood cells, normal breast, normal prostate, normal uterus, normal
astrocytes, normal glial cells and normal neurons, respectively. In an additional
embodiment, the nucleic acid used as a "driver" is isolated from an individual
having a cell proliferative disorder such as colon cancer, lung cancer, renal
cancer, leukemia, breast cancer, prostate cancer, uterine cancer, astrocytoma,
glioblastoma, and neuroblastoma. The nucleic acid used as a "tester" is thus
normal colon, normal lung, normal kidney, normal blood cells, normal breast,
normal prostate, normal uterus, normal astrocytes, normal glial cells and normal
neurons, respectively. One of skill in the art will readily be able to identify the
"tester" nucleic acid useful to identify methylated nucleic acid sequences
in a given "driver" population.
SCREENING AGENTS FOR AN EFFECT ON METHYLATION

The invention provides a method for identifying an agent which can 
affect methylation. An agent can affect methylation by either increasing or
decreasing methylation. The method includes incubating an agent and a
sample containing a CpG-containing polynucleotide under conditions
sufficient to allow the components to interact, and measuring the effect of the
compound on the methylation of the CpG-containing nucleic acid. In one
embodiment, the sample is a cell expressing a polynucleotide of interest. In
another embodiment, the sample is substantially purified nucleic acid.
"Substantially purified" nucleic acid is nucleic acid which has been separated
from the cellular components which naturally accompany it, or from
contaminating elements such as proteins, lipids, or chemical resins.
Substantially pure nucleic acid can be extracted from any cell type, or can be
chemically synthesized. Purity can be measured by any appropriate method,
such as measuring the absorbance of light (e.g., A260/A280 ratio).

The nucleic acid can be identified by the methylated CpG island 
amplification, as described above. The methylation of the polynucleotide in
the sample can then be compared to the methylation of a control sample not
incubated with the agent. The effect of the agent on methylation of a
polynucleotide can be measured by assessing the methylation of the
polynucleotide by the methods of the invention. Alternatively, the effect of
the agent on methylation of a polynucleotide can be measured by assessing the
expression of the polynucleotide of interest. Means of measuring expression
are well known to one of skill in the art (e.g., Northern blotting or RNA dot
blotting, amongst others).

The agents which affect methylation can include peptides,
peptidomimetics, polypeptides, pharmaceuticals, chemical compounds,
and biological agents. Psychotropic, antiviral, and chemotherapeutic
compounds can also be tested using the method of the invention.
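
The comparison between a sample incubated with an agent and a control sample not incubated with it can be sketched in Python as follows. The function name and locus labels are hypothetical, and each input is assumed to be a mapping from a locus to whether methylation was detected there (for example, by hybridizing MCA products to a probe).

```python
from typing import Dict


def methylation_changes(control: Dict[str, bool],
                        treated: Dict[str, bool]) -> Dict[str, str]:
    """Classify each locus measured in both samples as showing
    'increased', 'decreased', or 'unchanged' methylation after
    incubation with the agent."""
    changes = {}
    for locus in sorted(control.keys() & treated.keys()):
        if treated[locus] and not control[locus]:
            changes[locus] = "increased"
        elif control[locus] and not treated[locus]:
            changes[locus] = "decreased"
        else:
            changes[locus] = "unchanged"
    return changes


# Hypothetical results for three loci.
control = {"locus_1": True, "locus_2": False, "locus_3": True}
treated = {"locus_1": False, "locus_2": False, "locus_3": True}
print(methylation_changes(control, treated))
```

An agent that yields any "increased" or "decreased" call would, under these assumptions, be a candidate for affecting methylation.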


"Incubating" includes conditions which allow contact between the test 
agent and the cell of interest. "Contacting" includes in solution and solid 
phase. The test agent may also be a combinatorial library for screening a 
plurality of compounds. Agents identified in the method of the invention can
be further cloned, sequenced, and the like, either in solution or after binding to
a solid support, by any method usually applied to the isolation of a specific
DNA sequence. Molecular techniques for DNA analysis (Landegren et al.,
Science 242:229-237, 1988) and cloning have been reviewed (Sambrook et al.,
Molecular Cloning: a Laboratory Manual, 2nd Ed.; Cold Spring Harbor
Laboratory Press, Plainview, NY, 1998).

The sample can be any sample of interest. The sample may be a cell 
sample or a membrane sample prepared from a cell sample. Suitable cells 
include any host cells containing a nucleic acid including a CpG island. The 
cells can be primary cells or cells of a cell line.

In one embodiment, the agent is incubated with the sample of interest 
suspected of including a CpG-containing nucleic acid and methylation is 
evaluated by MCA. Thus, nucleic acid from the sample suspected of 
including a CpG-containing nucleic acid is contacted with a methylation
sensitive restriction endonuclease which cleaves only unmethylated CpG sites
under conditions and for a time to allow cleavage of unmethylated nucleic
acid. An isoschizomer of the methylation sensitive restriction endonuclease is
also utilized. An oligonucleotide is added to the nucleic acid sample under
conditions and for a time to allow ligation of the oligonucleotide to nucleic
acid cleaved by said restriction endonuclease, and the digested nucleic acid is
amplified. The digested nucleic acid is adhered to a membrane, and the
membrane is hybridized with a probe of interest. In one embodiment,
representational difference analysis can also be performed.
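
The digestion, ligation, and amplification steps summarized above can be modeled with a short Python sketch. It assumes complete digestion, that the methylation sensitive enzyme leaves ends to which the oligonucleotide cannot ligate, and that the isoschizomer leaves a ligatable overhang at every remaining (methylated) site; the function name and site positions are hypothetical.

```python
from typing import List, Tuple


def amplifiable_fragments(sites: List[Tuple[int, bool]]) -> List[Tuple[int, int]]:
    """Given recognition sites as (position, is_methylated) pairs, return the
    (start, end) fragments expected to amplify.

    Unmethylated sites are cut first by the methylation sensitive enzyme;
    methylated sites are then cut by the isoschizomer, leaving an overhang.
    Only fragments whose two ends both come from methylated sites can take
    up the oligonucleotide at both ends and be amplified with the primer."""
    ordered = sorted(sites)
    fragments = []
    for (left_pos, left_meth), (right_pos, right_meth) in zip(ordered, ordered[1:]):
        if left_meth and right_meth:
            fragments.append((left_pos, right_pos))
    return fragments


# Hypothetical region with four sites; only the first two adjacent sites
# are both methylated.
sites = [(100, True), (400, True), (900, False), (1500, True)]
print(amplifiable_fragments(sites))
```

Under these assumptions, amplification enriches for stretches bounded by two methylated sites, which is the population that is then adhered to a membrane and probed.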

KITS

The materials for use in the assay of the invention are ideally suited for
the preparation of a kit. Such a kit may comprise a carrier means containing
one or more container means such as vials, tubes, and the like, each of the
container means comprising one of the separate elements to be used in the
method. One of the container means can comprise a container containing an
oligonucleotide for ligation to nucleic acid cleaved by a methylation sensitive
restriction endonuclease. One or more container means can also be included
comprising a primer complementary to the oligonucleotide. In addition, one
or more container means can also be included which comprise a methylation
sensitive restriction endonuclease. One or more container means can also be
included containing an isoschizomer of said methylation sensitive restriction
enzyme.

In another embodiment, the kit may comprise a carrier means 
containing one or more container means comprising a solid support, wherein 
the solid support has a nucleic acid sequence selected from the group
consisting of MINT1-33 immobilized on the solid support. In one
embodiment, the solid support is a membrane. Several membranes are known
to one of skill in the art for the adhesion of nucleic acid sequences. Specific
non-limiting examples of these membranes include nitrocellulose (Nitropure)
or other membranes used for detection of gene expression such as
polyvinylchloride, diazotized paper and other commercially available
membranes such as Genescreen™, Zetaprobe™ (Biorad), and Nytran™. The
MINT1-33 sequences immobilized on the solid support can then be hybridized
to nucleic acid sequences produced by performing the MCA procedure on the
nucleic acids of a sample of interest in order to determine if the nucleic acid
sequences contained in the sample are methylated.

POLYNUCLEOTIDES AND POLYPEPTIDES

In another embodiment, the invention provides isolated MINT1,
MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27,
MINT30, MINT31, MINT32 and MINT33 polynucleotides (SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID
NO:33, respectively). These polynucleotides include DNA, cDNA and RNA
sequences which encode MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22,
MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and MINT33
polypeptides. It is understood that naturally occurring, synthetic, and
intentionally manipulated polynucleotides are included. For example,
MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14,
MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32 and MINT33 nucleic acids may be
subjected to site-directed mutagenesis. The nucleic acid sequence for MINT1,
MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27,
MINT30, MINT31, MINT32 and MINT33 also includes antisense sequences,
and sequences encoding dominant negative forms of MINT1, MINT2,
MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17,
MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30,
MINT31, MINT32 and MINT33.

The invention provides methylated and unmethylated forms of MINT1,
MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27,
MINT30, MINT31, MINT32 and MINT33 polynucleotides (SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID
NO:33, respectively). Methylated nucleic acid sequences are also provided
which include MINT3, MINT5, MINT7, MINT11, MINT12, MINT13,
MINT16, MINT18, MINT21, MINT25, MINT26, MINT28, and MINT29
(SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:11, SEQ ID
NO:12, SEQ ID NO:13, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:21,
SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:28, and SEQ ID NO:29,
respectively). It is understood that naturally occurring, synthetic, and
intentionally manipulated polynucleotides are included.



The polynucleotides of the invention include "degenerate variants,"
sequences that are degenerate as a result of the genetic code. There are 20
natural amino acids, most of which are specified by more than one codon.
Therefore, all degenerate nucleotide sequences are included in the invention as
long as the amino acid sequence of a polypeptide encoded by the nucleotide
sequence of SEQ ID NOs: 1-33 is functionally unchanged.
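
Degeneracy of the genetic code can be illustrated with a brief Python sketch in which two different nucleotide sequences translate to the same amino acid sequence. The codon table below is a small subset of the standard genetic code, and the two example sequences are hypothetical; neither is taken from SEQ ID NOs: 1-33.

```python
# Subset of the standard genetic code (DNA codons -> one-letter amino acids).
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon using CODON_TABLE."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def are_degenerate_variants(seq_a: str, seq_b: str) -> bool:
    """True if two different nucleotide sequences encode the same peptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


# Two hypothetical coding sequences that differ at the nucleotide level.
print(translate("ATGGCTAAACTGTAA"))
print(translate("ATGGCGAAGTTATGA"))
print(are_degenerate_variants("ATGGCTAAACTGTAA", "ATGGCGAAGTTATGA"))
```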

Specifically disclosed herein are methylated and unmethylated isolated
polynucleotide sequences of MINT1, MINT2, MINT4, MINT6, MINT8,
MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20,
MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and
MINT33. Preferably, the nucleotide sequence is SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ ID
NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID
NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID NO:33,
respectively. Specifically disclosed herein are methylated isolated
polynucleotide sequences of MINT3, MINT5, MINT7, MINT11, MINT12,
MINT13, MINT16, MINT18, MINT21, MINT25, MINT26, MINT28, and
MINT29. Preferably, the nucleotide sequence is SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ
ID NO:16, SEQ ID NO:18, SEQ ID NO:21, SEQ ID NO:25, SEQ ID NO:26,
SEQ ID NO:28, and SEQ ID NO:29, respectively. The term "polynucleotide"
or "nucleic acid sequence" refers to a polymeric form of nucleotides at least
10 bases in length. By "isolated polynucleotide" is meant a polynucleotide
that is not immediately contiguous with both of the coding sequences with
which it is immediately contiguous (one on the 5' end and one on the 3' end) in
the naturally occurring genome of the organism from which it is derived. The
term therefore includes, for example, a recombinant DNA which is
incorporated into a vector; into an autonomously replicating plasmid or virus;
or into the genomic DNA of a prokaryote or eukaryote, or which exists as a
separate molecule (e.g., a cDNA) independent of other sequences. The
nucleotides of the invention can be ribonucleotides, deoxyribonucleotides, or
modified forms of either nucleotide. The term includes single and double
stranded forms of DNA.


The polynucleotide encoding MINT1, MINT2, MINT4, MINT6,
MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20,
MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and
MINT33 includes SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:14, SEQ
ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:22,
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:27, SEQ ID NO:30, SEQ ID
NO:31, SEQ ID NO:32, and SEQ ID NO:33, dominant negative forms of
MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14,
MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32 and MINT33, and nucleic acid
sequences complementary to SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID
NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:27, SEQ ID
NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID NO:33. A
complementary sequence may include an antisense nucleotide. When the
sequence is RNA, the deoxynucleotides A, G, C, and T of SEQ ID NO:1, SEQ
ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ
ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID
NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID NO:33
are replaced by ribonucleotides A, G, C, and U, respectively. Also included in
the invention are fragments of the above-described nucleic acid sequences that
are at least 15 bases in length, which is sufficient to permit the
fragment to selectively hybridize to DNA that is encoded by SEQ ID NO:1, SEQ
ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ
ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID
NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID NO:33
under physiological conditions or a close family member of MINT1, MINT2,
MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17,
MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30,
MINT31, MINT32 and MINT33. The term "selectively hybridize" refers to
hybridization under moderately or highly stringent conditions which excludes
non-related nucleotide sequences. Hybridization conditions have been
described above.
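
Two of the points above, the RNA form of a sequence and the minimum fragment length of 15 bases, can be shown in a short Python sketch. The function names and the 20-base example sequence are hypothetical and are not drawn from the sequence listing.

```python
from typing import List

MIN_FRAGMENT_LENGTH = 15  # minimum fragment length stated in the text


def to_rna(dna: str) -> str:
    """Write a DNA sequence as RNA: A, G, and C are kept and T becomes U."""
    return dna.upper().replace("T", "U")


def fragments(seq: str, length: int = MIN_FRAGMENT_LENGTH) -> List[str]:
    """List every contiguous fragment of exactly `length` bases."""
    if length < MIN_FRAGMENT_LENGTH:
        raise ValueError("fragments must be at least 15 bases long")
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


example = "ATGCGTACCGGTTAGCATCG"  # hypothetical 20-base sequence
print(to_rna(example))
print(len(fragments(example)))
```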


The MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10,
MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23,
MINT24, MINT27, MINT30, MINT31, MINT32, and MINT33 nucleotide
sequence includes the disclosed sequence and conservative variations of the
polypeptides encoded by MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22,
MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and MINT33
polynucleotides. The term "conservative variation" as used herein denotes the
replacement of an amino acid residue by another, biologically similar residue.
Examples of conservative variations include the substitution of one
hydrophobic residue such as isoleucine, valine, leucine or methionine for
another, or the substitution of one polar residue for another, such as the
substitution of arginine for lysine, glutamic for aspartic acid, or glutamine for
asparagine, and the like. The term "conservative variation" also includes the
use of a substituted amino acid in place of an unsubstituted parent amino acid
provided that antibodies raised to the substituted polypeptide also
immunoreact with the unsubstituted polypeptide.
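
The examples of conservative variation given above can be encoded directly, as in the following Python sketch. It covers only the substitutions named in the text (exchanges among isoleucine, valine, leucine, and methionine, and the arginine/lysine, glutamic/aspartic acid, and glutamine/asparagine pairs); the names are hypothetical and the list is not meant to be exhaustive.

```python
# One-letter codes for the hydrophobic residues named in the text.
HYDROPHOBIC = {"I", "V", "L", "M"}

# Polar pairs named in the text: Arg/Lys, Glu/Asp, Gln/Asn.
CONSERVATIVE_PAIRS = {frozenset("RK"), frozenset("ED"), frozenset("QN")}


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` is one of the
    conservative variations given as examples in the text."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return True
    return frozenset((a, b)) in CONSERVATIVE_PAIRS


print(is_conservative("I", "V"))
print(is_conservative("K", "R"))
print(is_conservative("R", "D"))
```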


MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10,
MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23,
MINT24, MINT27, MINT30, MINT31, MINT32 and MINT33 nucleic acid
sequences can be expressed in vitro by DNA transfer into a suitable host cell.
"Host cells" are cells in which a vector can be propagated and its DNA
expressed. The cell may be prokaryotic or eukaryotic. The term also includes
any progeny of the subject host cell. It is understood that all progeny may not
be identical to the parental cell since there may be mutations that occur during
replication. However, such progeny are included when the term "host cell" is
used. Methods of stable transfer, meaning that the foreign DNA is
continuously maintained in the host, are known in the art.

In one aspect, the MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22,
MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and MINT33
polynucleotide sequences may be inserted into an expression vector. The term
"expression vector" refers to a plasmid, virus or other vehicle known in the art
that has been manipulated by insertion or incorporation of the genetic
sequences of interest. Polynucleotide sequences which encode a sequence
of interest can be operatively linked to expression control sequences.
"Operatively linked" refers to a juxtaposition wherein the components so
described are in a relationship permitting them to function in their intended
manner. An expression control sequence operatively linked to a coding
sequence is ligated such that expression of the coding sequence is achieved
under conditions compatible with the expression control sequences. As used
herein, the term "expression control sequences" refers to nucleic acid
sequences that regulate the expression of a nucleic acid sequence to which they
are operatively linked. Expression control sequences are operatively linked to a
nucleic acid sequence when the expression control sequences control and
regulate the transcription and, as appropriate, translation of the nucleic acid
sequence. Thus expression control sequences can include appropriate
promoters, enhancers, transcription terminators, a start codon (i.e., ATG) in
front of a protein-encoding gene, splicing signals for introns, maintenance of
the correct reading frame of that gene to permit proper translation of mRNA,
and stop codons. The term "control sequences" is intended to include, at a
minimum, components whose presence can influence expression, and can also
include additional components whose presence is advantageous, for example,
leader sequences and fusion partner sequences. Expression control sequences
can include a promoter.

By "promoter" is meant a minimal sequence sufficient to direct
transcription. Also included in the invention are those promoter
elements which are sufficient to render promoter-dependent gene
expression controllable for cell-type specific or tissue-specific
expression, or inducible by external signals or agents; such elements
may be located in the 5' or 3' regions of the gene. Both constitutive
and inducible promoters are included in the invention (see, e.g., Bitter
et al., Methods in Enzymology 153:516-544, 1987). For example, when
cloning in bacterial systems, inducible promoters such as pL of
bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid promoter) and the
like may be used. When cloning in mammalian cell systems, promoters
derived from the genome of mammalian cells (e.g., metallothionein
promoter) or from mammalian viruses (e.g., the retrovirus long terminal
repeat; the adenovirus late promoter; the vaccinia virus 7.5K promoter)
may be used. Promoters produced by recombinant DNA or synthetic
techniques may also be used to provide for transcription of the nucleic
acid sequences of the invention.

In the present invention, the MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32 and MINT33 polynucleotide sequences may
be inserted into an expression vector which contains a promoter sequence
which facilitates the efficient transcription of the inserted genetic
sequence in the host. The expression vector typically contains an origin
of replication, a promoter, as well as specific genes which allow
phenotypic selection of the transformed cells. Vectors suitable for use
in the present invention include, but are not limited to, the T7-based
expression vector for expression in bacteria (Rosenberg et al., Gene
56:125, 1987), the pMSXND expression vector for expression in mammalian
cells (Lee and Nathans, J. Biol. Chem. 262:3521, 1988) and
baculovirus-derived vectors for expression in insect cells. The DNA
segment can be present in the vector operably linked to regulatory
elements, for example, a promoter (e.g., T7, metallothionein I, or
polyhedrin promoters).



MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32 and MINT33 polynucleotide sequences can be expressed in either
prokaryotes or eukaryotes. Hosts can include microbial, yeast, insect
and mammalian organisms. Methods of expressing DNA sequences having
eukaryotic or viral sequences in prokaryotes are well known in the art.
Biologically functional viral and plasmid DNA vectors capable of
expression and replication in a host are known in the art. Such vectors
are used to incorporate DNA sequences of the invention.

By "transformation" is meant a genetic change induced in a cell
following incorporation of new DNA (i.e., DNA exogenous to the cell).
Where the cell is a mammalian cell, the genetic change is generally
achieved by introduction of the DNA into the genome of the cell (i.e.,
stable).

By "transformed cell" is meant a cell into which (or into an ancestor of
which) has been introduced, by means of recombinant DNA techniques, a
DNA molecule encoding the sequence of interest. Transformation of a host
cell with recombinant DNA may be carried out by conventional techniques
as are well known to those skilled in the art. Where the host is
prokaryotic, such as E. coli, competent cells which are capable of DNA
uptake can be prepared from cells harvested after exponential growth
phase and subsequently treated by the CaCl2 method using procedures well
known in the art. Alternatively, MgCl2 or RbCl can be used.
Transformation can also be performed after forming a protoplast of the
host cell if desired.

When the host is a eukaryote, such methods of transfection of DNA as
calcium phosphate co-precipitates, conventional mechanical procedures
such as microinjection, electroporation, insertion of a plasmid encased
in liposomes, or virus vectors may be used. Eukaryotic cells can also be
cotransformed with DNA sequences encoding the sequence of interest, and
a second foreign DNA molecule encoding a selectable phenotype, such as
the herpes simplex thymidine kinase gene. Another method is to use a
eukaryotic viral vector, such as simian virus 40 (SV40) or bovine
papilloma virus, to transiently infect or transform eukaryotic cells and
express the protein (see, for example, Eukaryotic Viral Vectors, Cold
Spring Harbor Laboratory, Gluzman ed., 1982).

Isolation and purification of microbially expressed polypeptide, or
fragments thereof, provided by the invention, may be carried out by
conventional means including preparative chromatography and
immunological separations involving monoclonal or polyclonal antibodies.

In one embodiment, the invention provides substantially purified
polypeptide encoded by MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10,
MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27,
MINT30, MINT31, MINT32 and MINT33 polynucleotide sequences. The term
"substantially purified" as used herein refers to a polypeptide which is
substantially free of other proteins, lipids, carbohydrates or other
materials with which it is naturally associated. One skilled in the art
can purify a polypeptide encoded by a MINT1, MINT2, MINT4, MINT6, MINT8,
MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23,
MINT24, MINT27, MINT30, MINT31, MINT32 or MINT33 polynucleotide sequence
using standard techniques for protein purification. The substantially
pure polypeptide will yield a single major band on a non-reducing
polyacrylamide gel. The purity of the MINT1, MINT2, MINT4, MINT6, MINT8,
MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23,
MINT24, MINT27, MINT30, MINT31, MINT32 and MINT33 polypeptide can also
be determined by amino-terminal amino acid sequence analysis.

Minor modifications of the MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32 and MINT33 primary amino acid sequences
may result in proteins which have substantially equivalent activity as
compared to the unmodified counterpart polypeptide described herein.
Such modifications may be deliberate, as by site-directed mutagenesis,
or may be spontaneous. All of the polypeptides produced by these
modifications are included herein as long as the biological activity
still exists.

The polypeptides of the invention also include dominant negative forms
of the MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32 or MINT33 polypeptide which do not have the biological activity
of the MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32 or MINT33 polynucleotide sequence. A "dominant negative form" of
MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32 or MINT33 is a polypeptide that is structurally similar to the
MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32 or MINT33 polypeptide but does not have wild-type MINT1, MINT2,
MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19,
MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 or MINT33
function. For example, a dominant-negative MINT1, MINT2, MINT4, MINT6,
MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22,
MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 or MINT33 polypeptide may
interfere with wild-type MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32 or MINT33 function by binding to, or
otherwise sequestering, regulating agents, such as upstream or
downstream components, that normally interact functionally with the
MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32 or MINT33 polypeptide.

EXAMPLES 

The following examples are intended to illustrate but not to limit the
invention in any manner, shape, or form, either explicitly or
implicitly. While they are typical of those that might be used, other
procedures, methodologies, or techniques known to those skilled in the
art may alternatively be used.

EXAMPLE 1 

DETECTION OF METHYLATED CPG ISLANDS USING MCA

The principle underlying MCA involves amplification of closely spaced
methylated SmaI sites to enrich for methylated CGIs. The MCA technique
is outlined in Figure 1A. About 70 to 80% of CpG islands contain at
least two closely spaced (<1 kb) SmaI sites (CCCGGG). Only those SmaI
sites within these short distances can be amplified using MCA, ensuring
representation of the most CpG rich sequences. Briefly, DNA is digested
with SmaI, which cleaves only unmethylated sites, leaving blunt ends
between the C and G. DNA is then digested with the SmaI isoschizomer
XmaI, which does cleave methylated CCCGGG sites, and which leaves a 4
base overhang. Adaptors are ligated to this overhang, and PCR is
performed using primers complementary to these adaptors. The amplified
DNA is then spotted on a nylon membrane and can be hybridized with any
probe of interest.
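
The site-spacing requirement described above can be illustrated with a
short program. This is a minimal sketch, not part of the described
method: it scans a DNA string for the SmaI/XmaI recognition sequence
CCCGGG and reports adjacent site pairs that lie less than 1 kb apart,
i.e. the candidate fragments that MCA could amplify. The function names
and the demonstration sequence are hypothetical. The sketch looks at
sequence only; whether a fragment is actually amplified depends on both
sites being methylated and therefore resistant to SmaI.

```python
import re

SMAI_SITE = "CCCGGG"  # recognition sequence shared by SmaI and XmaI


def find_sites(seq, site=SMAI_SITE):
    """Return 0-based start positions of every occurrence of `site`."""
    seq = seq.upper()
    # A zero-width lookahead reports every start position, even if
    # occurrences were to overlap.
    return [m.start() for m in re.finditer("(?=%s)" % re.escape(site), seq)]


def amplifiable_fragments(seq, max_len=1000):
    """Return (start, end) pairs of adjacent sites spaced < max_len bp."""
    sites = find_sites(seq)
    return [(a, b) for a, b in zip(sites, sites[1:]) if b - a < max_len]


if __name__ == "__main__":
    # Hypothetical sequence: three sites, spaced 406 bp and 1206 bp apart.
    demo = "AT" * 10 + SMAI_SITE + "AT" * 200 + SMAI_SITE + "AT" * 600 + SMAI_SITE
    print(find_sites(demo))
    print(amplifiable_fragments(demo))
```

Only the first pair of sites in the demonstration sequence falls under
the 1 kb cutoff, so only that pair is reported as a candidate fragment.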

As a model experiment, amplification of the p16 gene CGI was examined
because (1) hypermethylation of this CGI in cancer is well
characterized, and correlates with silencing of the gene (Herman et al.,
1995), and (2) this CGI contains two closely spaced SmaI sites (400 bp)
which can be amplified by MCA. Initially, the reaction was optimized by
testing different primers with a variable GC content, and different PCR
conditions. As shown in Figure 1B, using primers with a 70% GC content,
the p16 CGI is amplified strongly in the Caco2 cell line, where it is
known to be hypermethylated, while no signal above background was
detected from any normal colon mucosa. To examine the quantitative
aspect of MCA, DNA from Caco2 and normal colon mucosa were mixed in
various proportions, and the methylation level of each mix was
determined using MCA. MCA detected p16 methylation in a
semi-quantitative manner between 1% and 100% methylated alleles.
Finally, MCA was performed on 109 samples of normal colonic mucosa and
adjacent primary colorectal tumor that had previously been typed for p16
methylation by Southern blot analysis (Ahuja et al., 1997). MCA and
Southern blot were concordant in 107/109 (98%) of the cases. In one
case, MCA detected a low level of methylation (5-10%) in a cancer sample
that had been judged negative by Southern blot. In the other discordant
case (positive by MCA, negative by Southern blot), the discordance may
be related to heterogeneous p16 methylation, as has been described
(Costello et al., 1996).

MCA is a novel PCR-based technique that allows for the rapid enrichment
of hypermethylated CG rich sequences, with a high representation of
methylated CpG islands. This technique can have several potential
applications. MCA is very useful for the determination of the
methylation status of a large number of samples at multiple loci
simultaneously. By optimizing the PCR conditions, it should be readily
adaptable to the study of the methylation status of any gene that has
two closely spaced SmaI sites. As shown herein, there is a very high
concordance rate between MCA and other methods for the detection of
hypermethylation such as Southern blot analysis and bisulfite-based
methods. However, MCA (1) requires good quality DNA, excluding the study
of paraffin-embedded samples, (2) examines only a limited number of CpG
sites within a CGI and (3) is sensitive to incomplete digestion using
the methylation-sensitive enzyme SmaI. Nevertheless, many steps in MCA
are amenable to automation and, by allowing for the examination of
multiple genes relatively quickly, may have important applications in
population-based studies of CGI methylation.



EXAMPLE 2 

IDENTIFICATION OF DIFFERENTIALLY METHYLATED CGIs
IN CRC BY MCA/RDA

To identify novel CGIs aberrantly methylated in CRC, RDA (Lisitsyn et
al., 1993) was performed on MCA amplicons from the colon cancer cell
line Caco2 as a tester, and a mixture of DNA from the normal colon
mucosa of 5 different men (to avoid cloning polymorphic SmaI sites or
inactive and methylated X chromosome genes from women) as a driver. Two
separate experiments were conducted, one using a lower annealing
temperature (72°C), and the other using a higher annealing temperature
(77°C) and more GC rich primers. After two rounds of RDA, the PCR
products were cloned, and colonies containing inserts were identified by
PCR. Based on initial experiments, we expected most of the recovered
clones to contain Alu repetitive sequences, which are CG rich and
hypermethylated (Kochanek et al., 1993). All clones were therefore
probed with an Alu fragment, and only non-hybridizing clones were
analyzed further. Out of 160 non-Alu clones, 46 were independent clones
and 33 of these (MINT1-33, Methylated IN Tumors, SEQ ID NOs:1-33,
respectively) appeared to be differentially methylated in Caco2 cells by
comparing hybridization to MCA products from Caco2 and normal colon
(Figure 1C). Nineteen of the clones (MINT1-19) were obtained using the
lower annealing temperature, and 14 (MINT20-33) using the higher
temperature.

To confirm the aberrant methylation of these clones, Southern blot
analysis was performed using DNA digested with SmaI or XmaI. All of the
33 clones were hypermethylated in Caco2 compared to normal colon mucosa.
Of these 33, one clone (MINT13) detected highly repeated sequences and
two clones (MINT18 and MINT28) appeared to correspond to mildly repeated
gene families (data not shown). All others appeared to detect single
copy DNA fragments. In addition, hypermethylation at CpG sites within
the clones and distinct from the SmaI sites was confirmed by
bisulfite-PCR for 6 clones. In each case, Caco2 was found to be
hypermethylated at these sites.

By DNA sequencing (example shown in Figure 2), we found that 29 clones
had a GC content greater than 50%, and satisfied the minimal criteria
for CGIs (200 bp, GC content>50%, CpG/GpC>0.5) (Gardiner-Garden and
Frommer, 1987). As might be expected, clones obtained with the higher
annealing temperature and more GC rich primers had a relatively higher
GC content (Table 1). The size of each clone, percentage of GC
nucleotides, observed/expected CGs, sequence homology, and chromosomal
location are summarized in Table 1. MINT5, MINT8, MINT11, MINT14 and
MINT16 contained GC rich regions only in one end of the clones, and
these may have been recovered from the edge of CGIs.
Table 1: Summary of the 33 Differentially Methylated Clones Isolated by MCA-RDA

Clone    Size (bp)  GC (%)  O/E  CpG island  Homology            Location  Type
[...]    591        58      0.8  Yes         CpG clone 73e1      7q11      Type A
MINT27   242        74      0.7  Yes         None                N.D.      Type C
MINT28   463        58      1    Yes         Ribosomal RNA gene  N.D.      Type A
MINT29   429        60      0.7  Yes         CpG clone 20b1      7q11      N.D.
MINT30   536        65      0.5  Yes         None                20q11     Type A
MINT31   673        65      0.8  Yes         None                17q21     Type C
MINT32   464        66      1    Yes         None                20q13     Type A
MINT33   139        65      0.8  Yes         None                N.D.      N.D.

O/E: Observed/expected numbers of CpGs. N.D.: not determined.
* Only one portion of the clones has a CpG island.
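
The sequence statistics behind the CGI criteria and the O/E column can
be computed directly from a clone sequence. The following is a minimal
sketch under stated assumptions: the observed/expected ratio is taken as
(number of CpG x length) / (number of C x number of G), a common reading
of Gardiner-Garden and Frommer, and the island test applies the
thresholds quoted in the text (at least 200 bp, GC content above 50%,
CpG/GpC above 0.5). The function names are hypothetical and the test
sequences are artificial, not MINT clone sequences.

```python
def cpg_island_stats(seq):
    """Return length, GC percentage, O/E CpG and CpG/GpC ratio of `seq`."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides
    gpc = seq.count("GC")  # GpC dinucleotides
    gc_percent = 100.0 * (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently (assumption).
    expected = c * g / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    if gpc:
        cpg_gpc = cpg / gpc
    else:
        cpg_gpc = float("inf") if cpg else 0.0
    return {
        "length": n,
        "gc_percent": gc_percent,
        "obs_exp": obs_exp,
        "cpg_gpc": cpg_gpc,
    }


def is_cpg_island(seq, min_len=200, min_gc=50.0, min_cpg_gpc=0.5):
    """Apply the minimal CGI criteria quoted in the text."""
    s = cpg_island_stats(seq)
    return (
        s["length"] >= min_len
        and s["gc_percent"] > min_gc
        and s["cpg_gpc"] > min_cpg_gpc
    )


if __name__ == "__main__":
    print(cpg_island_stats("CG" * 100))
    print(is_cpg_island("CG" * 100), is_cpg_island("AT" * 100))
```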
By DNA homology search using the BLAST program (BLAST 2.0, default
parameters, see
http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-newblast?J-form=0), four
clones were identical to human gene sequences, four clones were
identical to CGIs randomly sequenced from a CGI library (Cross et al.,
1994), one was identical to an EST, two clones were identical to high
throughput genomic sequences deposited in GenBank, three clones had
significant homology to other genes and the other 19 had no significant
match in the database. MINT11 was identical to exon 1 and intron 1 of
the human versican gene (Zimmerman et al., 1989), and corresponded to
the 3' edge of a promoter associated CGI; MINT14 was identical to exon 1
of the human alpha-tubulin gene (Dobner, P.R., et al., 1987), and was
also the 3' edge of the CGI; MINT24 corresponded to the 3' noncoding
region of the human homeobox gene CSX (Turbay et al., 1996); MINT21 had
a region with 94% homology at the nucleotide level to exon 2 of the
mouse OPT gene (Simeone et al., 1994) and probably represents the human
homologue of this gene; MINT28 was homologous to ribosomal gene
sequences; MINT18 was homologous to the acrogranin gene family. To
examine the presence of potential promoter sequences in these clones,
promoter prediction was performed using several computer programs (see
programs available at
http://dot.imgen.bcm.tmc.edu:9331/seq-search/gene-search.html). Twenty
out of the 33 clones were predicted as promoters using the NNPP program,
and 6 were predicted as promoters by using the TSSG program.

The chromosomal position of most of the unknown clones was determined
using a somatic cell hybrid panel and a radiation hybrid panel (Table
1). Of note, MINT3 and MINT9 mapped to chromosome 1p35-36, MINT13 mapped
to 7q31, MINT24 mapped to 3p25-26, MINT25 mapped to 22q11-Ter, and
MINT31 mapped to 17q21. All of these chromosomal segments are areas that
are frequently deleted in various tumors.

An important application of MCA is in the discovery of novel genes
hypermethylated in cancer. As demonstrated here, MCA coupled with RDA is
a rapid and powerful technology for this purpose, and compares favorably
with other described techniques (Hayashizaki et al., 1994; Gonzalgo et
al., 1997; Huang et al., 1997). In addition to the identification of
genes hypermethylated in cancer, MCA could potentially be used to
discover novel imprinted genes using parthenogenetic DNA (Kaneko-Ishino
et al., 1995), as well as novel X-chromosome genes.

EXAMPLE 3 

SILENCING OF THE VERSICAN GENE IN CRC

To determine whether some of these clones truly represented genes
silenced by methylation, we examined the versican gene in more detail.
Versican is a secreted glycoprotein that appears to be regulated by the
Rb tumor suppressor gene (Rohde et al., 1996). MINT11 corresponds to
part of exon 1 and part of intron 1 of the versican gene (Figure 3A).
Hypermethylation of the two SmaI sites in exon 1 and intron 1 in colon
cancer cell lines was confirmed by both Southern blot analysis and MCA.
In order to determine if this methylation was representative of the
entire CGI, including the proximal promoter, PCR was performed on
bisulfite-treated DNA using primers designed to amplify the region
around the transcription start site of this gene. The PCR product was
then digested with restriction enzymes that distinguish methylated from
unmethylated DNA. The versican promoter was found to be completely
methylated in the colon cancer cell lines DLD1, LOVO, SW48 and SW837,
and partially methylated in HCT116 and HT29 (Figure 3B). In primary
colon tumors, versican was hypermethylated in 17 out of 25 cases (68%).
Interestingly, some methylation of the versican promoter was also found
in normal tissues, albeit at lower levels when compared to tumors. The
level of methylation in normal colon mucosa increased with age of the
patient (Figure 3C), from an average of 6.9% in patients between 20 and
30 years of age, to an average of 28.9% in patients over 80. A linear
regression analysis revealed a significant association between age and
versican promoter methylation (R=0.7, P<0.000001). Using RT-PCR, we next
examined the expression of versican in normal colon mucosa and CRC cell
lines. Versican was found to be expressed in normal colon epithelium,
but was markedly down-regulated or absent in methylated colon cancer
cell lines. Expression of versican in all these cell lines was easily
restored after treatment with the demethylating agent,
5-aza-deoxycytidine. These data suggest that versican becomes methylated
in normal colon in an age-dependent manner, and that this leads to
hypermethylation and loss of expression in most colorectal tumors.
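
The age association reported above rests on a linear regression of
promoter methylation against patient age. The following is a minimal
sketch of such a fit using ordinary least squares and the Pearson
correlation coefficient. The function name is hypothetical, and the ages
and methylation percentages in the example are invented purely to
exercise the code; they are not the patient data behind Figure 3C.

```python
from math import sqrt


def linear_fit(x, y):
    """Least-squares fit of y on x; returns (slope, intercept, r)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


if __name__ == "__main__":
    # Hypothetical ages (years) and methylation levels (%), illustration only.
    ages = [25, 38, 47, 55, 63, 71, 84]
    methylation = [7.0, 9.5, 15.0, 12.0, 22.0, 19.5, 29.0]
    slope, intercept, r = linear_fit(ages, methylation)
    print("slope=%.3f intercept=%.3f r=%.3f" % (slope, intercept, r))
```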

Using MCA/RDA, 33 differentially methylated clones were identified and
characterized in detail. By sequencing, we found that 29 out of the 33
clones satisfy the criteria of CpG islands, demonstrating that MCA can
represent CGIs specifically. Of these 29 clones, 5 corresponded to
already known genes (versican, alpha-tubulin, CSX, OPT homologue and
ribosomal RNA gene). Of these, versican is most interesting in that this
proteoglycan is an Rb inducible gene (Rohde et al., 1996), suggesting
that down regulation of this gene product may have an important role in
colorectal carcinogenesis, where Rb mutations are rare. The data clearly
show that aberrant methylation of the versican gene promoter is
correlated with silencing of this gene. In addition, methylation of the
alpha-tubulin gene in Caco2 is consistent with the results of studying
the gene expression profile of colorectal cancers using SAGE (Zang et
al., 1997), which demonstrated that alpha-tubulin is markedly
down-regulated in CRC. Methylation of the CSX and OPT genes does not
coincide with their 5' end, and is therefore not expected to silence
these genes. It is possible, however, that these CpG islands are
associated with alternate transcripts of the genes, or with other nearby
genes, which would then be silenced by methylation (Wutz, A., et al.,
1997). Finally, methylation of ribosomal genes has previously been seen
in aging tissues (Swisshelm, K., et al., 1990) and therefore is not
surprising to find in cancers. Because some of the clones recovered are
in the exon 1 region of expressed genes, identification of new tumor
suppressor genes might be facilitated by using MCA/RDA clones as probes
for screening cDNA libraries. Indeed, based on their chromosome
location, several clones map to chromosomal regions thought to harbor
TSGs because they are highly deleted in various tumors (e.g., chromosome
1p35, 3p25-26, 7q31, 17q21 and 22q11-Ter).

EXAMPLE 4 

TWO TYPES OF METHYLATION IN CRC

By examining the methylation status of several known genes in colorectal
tumors, it has been previously demonstrated that some genes tend to be
methylated in an age-dependent manner in normal colon (Issa et al.,
1994), and are frequently methylated in CRC, while others are methylated
in cancers exclusively (Ahuja et al., 1997). To examine this issue on a
genome wide level in some detail, the methylation profile of 31 MINT
clones in a panel of colorectal tumors and corresponding normal colon
mucosa was examined using MCA (two clones could not be accurately
studied because of high background (MINT29) or small size (MINT33)).
Because all of these clones were recovered from a CRC cell line, there
was an initial concern that many of these were not representative of
methylation in primary (uncultured) tumors. However, of the 31 clones,
29 were also found to be methylated in some primary CRC. The two clones
methylated only in the cell line Caco2 were (1) MINT14, a LINE element,
and (2) MINT18, a sequence that had a very low CpG frequency and did not
qualify as a CGI. Thus, all non-repetitive CGIs recovered were
methylated in primary CRC as well as cell lines. Hypermethylation
patterns of these 29 clones fell into two distinct categories. A
majority of the clones (22 out of 29) were found to be frequently
methylated (>70%) in the tumors tested, and a slight amount of
methylation was also detected in normal colon mucosa. For all of these
clones, the normal colon mucosa obtained from young patients showed less
methylation compared to the normal mucosa from older patients (Figure
4B). Thus, the majority of CGIs hypermethylated in CRC are methylated in
normal colon mucosa as well, in an age related manner. This methylation
was named Type A for aging-specific methylation.
The remaining 7 clones were methylated exclusively in CRC, and their
frequency of methylation was significantly lower than type A methylation
(ranging from 10% to 50%). This type of methylation was named type C for
cancer-specific.
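
The distinction between the two categories can be expressed as a simple
rule on per-sample methylation calls. The following is a minimal sketch,
assuming each clone has a list of yes/no calls for tumors and another
for matched normal mucosa: a clone methylated in tumors and also
detectable in normal mucosa is labeled Type A, and a clone methylated in
tumors only is labeled Type C. The function name, the input format and
the example calls are hypothetical; the text assigns types from MCA
hybridization signals, not from this rule.

```python
def classify_clone(tumor_calls, normal_calls):
    """Label a clone Type A or Type C from boolean methylation calls.

    tumor_calls  -- one True/False call per primary tumor
    normal_calls -- one True/False call per normal mucosa sample
    Returns (label, fraction of tumors methylated).
    """
    if not tumor_calls:
        raise ValueError("at least one tumor call is required")
    tumor_freq = sum(1 for call in tumor_calls if call) / len(tumor_calls)
    if tumor_freq == 0:
        return ("not methylated in primary tumors", tumor_freq)
    # Any methylation in normal mucosa marks the aging-related pattern.
    if any(normal_calls):
        return ("Type A", tumor_freq)
    return ("Type C", tumor_freq)


if __name__ == "__main__":
    # Hypothetical calls for two clones.
    print(classify_clone([True] * 8 + [False] * 2, [True, False, True]))
    print(classify_clone([True, False, False, False], [False, False]))
```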
Recently, several reports have suggested that aberrant methylation of
CGIs may play an important role in cancer development (Baylin et al.,
1998; Jones, 1997). However, there is little integrated information on
aberrant CGI methylation in cancer at multiple loci, probably because of
the lack of a method to detect methylation in a large number of samples
for unselected CGIs throughout the genome. Furthermore, it has been
shown that cultured cell lines have a high degree of CGI methylation
(Antequera et al., 1990) but it was not known to what extent this
reflects methylation in primary cancers. To address these issues, the
relatively quantitative and high output features of MCA allowed us to
determine the methylation profile of 31 differentially methylated loci
in a panel of colorectal carcinomas.

Despite the fact that all sequences were initially recovered from a
colon cancer cell line, only 2 out of the 31 clones showed cell line
restricted methylation. From the sequence data, one of these two clones
was a repeated sequence (LINE1), and the other was not a CGI. Thus most
of the single copy clones recovered proved to be methylated not only in
cell lines but also in some primary colon cancers. Analysis of these 29
clones revealed two distinct types of hypermethylation in cancer (Type A
for aging and Type C for cancer), which may have distinct causes, and
different roles in cancer development. Type A methylation was seen in
the majority of these clones: 22 of 29 (74%) clones were methylated in
an age-related manner in normal colon tissue, and hypermethylated at a
high frequency in CRC, as we have shown for the ER gene (Issa et al.,
1994) and others (Issa et al., 1996; Ahuja et al., submitted). These
results suggest that a large number of CGIs in the human genome are
incrementally methylated during the aging process and, for many genes,
this methylation correlates with reduced gene expression as shown for ER
(Issa et al., 1994) and versican. Although the mechanism of Type A
methylation is unknown, it is likely to result from physiological
processes rather than a genetic alteration because (1) it is very
frequent and affects large numbers of cells, (2) it is present in all
individuals, not just patients with cancer and (3) this process is gene
and tissue specific (Ahuja et al., submitted). Because the methylation
status at a given CGI is thought to be related to positive (methylator)
factors (Mummaneni et al., 1993; Mummaneni et al., 1995; Magewu and
Jones, 1994; Vertino et al., 1996) and negative (protector) factors
(Macleod et al., 1994; Brandeis et al., 1994; Turker and Bestor, 1997;
Chen et al., 1997), it is possible that for some genes, this balance
slightly favors de-novo methylation, and that this is reflected by
progressive hypermethylation after repeated cell divisions.

EXAMPLE 5
GLOBAL HYPERMETHYLATION IN CRC

To understand the patterns of cancer-specific methylation in CRC, the
methylation status of all 7 type C clones was analyzed, as well as p16
in primary cancers and polyps (Figure 4). Two of these clones (MINT1 and
MINT2) were studied by both MCA and bisulfite-PCR, and the concordance
between the two techniques was found to be 98%. p16 was studied by both
MCA and Southern blot, with a concordance rate of 98%. When we
considered the six clones that were methylated in more than 10% of the
cases, as well as p16, a remarkable pattern emerged (summarized in
Figure 5 and Table 2).

The 50 CRC fell into two distinct groups: (1) a group with a high level
of Type C methylation, whereby all the tumors had methylation of 4 or
more loci simultaneously and (2) a group where methylation of any type C
clone is extremely rare. Thus, the first group of tumors appears to
display profound global hypermethylation (GH+), which is lacking in the
second group (GH-). Interestingly, there was a great concordance between
methylation of the p16 gene, which was not selected for by our cloning
process, and the presence of GH. In sharp contrast, Type A methylation
was not significantly different between GH+ and GH- tumors (Table 2).
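
The grouping described above amounts to counting methylated type C loci
per tumor. The following is a minimal sketch, assuming a per-tumor
mapping from locus name to a yes/no methylation call and using the
observed "4 or more loci" pattern as a cutoff. The function name, the
treatment of that observation as a fixed threshold, and the example
calls are assumptions made for illustration; the locus names are taken
from the text, but the calls are not data from Table 2.

```python
def gh_status(locus_calls, threshold=4):
    """Return 'GH+' if at least `threshold` type C loci are methylated."""
    methylated = sum(1 for call in locus_calls.values() if call)
    return "GH+" if methylated >= threshold else "GH-"


if __name__ == "__main__":
    # Hypothetical methylation calls for two tumors.
    tumor_a = {"MINT1": True, "MINT2": True, "MINT27": True,
               "MINT31": True, "p16": False}
    tumor_b = {"MINT1": False, "MINT2": True, "MINT27": False,
               "MINT31": False, "p16": False}
    print(gh_status(tumor_a), gh_status(tumor_b))
```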

GH was also detected in a subset of colorectal adenomas (Figure 5),
suggesting that it is an early event in carcinogenesis. Interestingly,
while 5 of 5 small adenomas (<7 mm) were GH-, 6 of 9 large adenomas (>10
mm) were GH+, suggesting that this defect may be acquired in the
transition between small and large adenomas. In 6 cases, both an adenoma
and a cancer from the same patients were examined. In one of these, GH
was detected both in the adenoma and the cancer; in 3 cases, GH was
detected in the cancer but not in the adenoma and in 2 cases, GH was
detected in neither the adenoma nor the cancer.

By contrast to type A methylation, type C methylation is relatively
infrequent in primary CRC, and is never observed in normal colon mucosa.
Furthermore, detailed analysis of type C methylation in CRC revealed a
striking pattern, suggesting the presence of global hypermethylation in a
subset of these tumors: GH positive cases are characterized by frequent and
concordant methylation of all type C clones examined, such that each tumor
has at least four methylation events. By contrast, type C methylation is
virtually non-existent in tumors without GH. This concordance cannot be due
to simple experimental variation or artifacts because (1) methylation was
verified using separate methods (MCA, bisulfite-PCR and Southern blots), (2)
the concordance was not limited to MCA/RDA derived clones since it also
affected the p16 (Herman et al, 1995) and hMLH1 (Kane et al, 1997) genes,
and (3) there was no significant difference in type A methylation between
GH+ and GH- tumors. Global hypermethylation appears to be an early event
in the development of CRC, being detectable in large pre-neoplastic
adenomas. Because many genes are potential candidates for inactivation
through promoter methylation (Baylin et al, 1998; Jones, 1996), global
hypermethylation may have profound pathophysiologic consequences in
neoplasia through the simultaneous inactivation of tumor-suppressor genes
(such as p16), metastasis-suppressor genes (such as E-cadherin), angiogenesis
inhibitors (such as Thrombospondin-1) and others. In fact, our data suggest
that global hypermethylation could also result in mismatch repair deficiency
through methylation and inactivation of the hMLH1 promoter, and may
explain up to 75% of cases of sporadic CRC with microsatellite instability.
The causes of type A and type C methylation are probably different because
the latter is detected only in a limited number of cases, and the genes affected
are different. Because of the remarkable concordance in type C methylation
among GH+ cases, it appears likely that these tumors all share a specific
defect in the maintenance of the methylation-free state in CGIs. This defect
could be either aberrant de-novo methylation (through a mutation in
DNA-methyltransferase for example), or loss of protection against de-novo
methylation, through the loss of a trans-activating factor (Macleod et al, 1994;
Chen et al, 1997). Because DNA-methyltransferase activity is similar in the
two groups, the latter hypothesis is more likely. Thus, at least in colorectal
cancer, it appears likely that type C methylation (an epigenetic error) is
actually caused by a genetic event that results in an increased chance of
methylating a subset of CGIs. Ironically, this epigenetic defect may then
result in additional genetic lesions through the induction of mismatch-repair
deficiency.



EXAMPLE 6 

MICROSATELLITE INSTABILITY IS LINKED TO
GLOBAL HYPERMETHYLATION IN CRC

In a previous study (Ahuja et al, 1997), a link was reported between
microsatellite instability and a hypermethylator phenotype in sporadic CRC.
Relatively few mutations in mismatch repair genes have been reported in
sporadic MI+ cancers, but hMLH1 methylation has recently been observed in
some cases (Kane et al, 1997). To determine the relation between global
hypermethylation and microsatellite instability in CRC, we measured hMLH1
methylation using bisulfite-PCR in our panel of CRC, which had also been
previously typed for the presence of microsatellite instability (Figure 5).
hMLH1 was studied only by bisulfite-PCR because it does not have 2 SmaI
sites in its CGI. Overall, 16 out of 50 (32%) cancers had evidence of
microsatellite instability. Among the 29 GH+ cases, 12 had evidence of
hMLH1 methylation, suggesting that hMLH1 is one of the targets of global
hypermethylation in CRC. All of these 12 tumors had microsatellite
instability. By contrast, hMLH1 methylation was detected in only one of the
21 GH- cases. These data establish a strong link between the GH phenotype,
hMLH1 methylation and microsatellite instability in CRC. Two lines of
evidence suggest that microsatellite instability may follow, and be caused by,
global hypermethylation and hMLH1 methylation. First, GH is detectable in
about half of colonic adenomas, but none of these tumors have hMLH1
methylation, and microsatellite instability is extremely rare in this
pre-neoplastic lesion (Samowitz and Slattery, 1997). Second, GH is not simply
caused by mismatch repair defects because microsatellite instability is absent
in more than half of the GH+ cases, and GH was absent in 4 of the 16 cancers
with microsatellite instability. Overall, our data suggest that, in sporadic
CRC, the majority (12 out of 16, or 75%) of cases with microsatellite
instability may be caused by GH followed by hMLH1 methylation, loss of
hMLH1 expression and resultant mismatch repair deficiency (Herman et al,
submitted).
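
The proportions quoted in this example follow directly from the reported
counts. The sketch below simply re-derives them; the variable names are
illustrative, and the only inputs are the numbers stated in the text.

```python
# Counts as reported in the text for the panel of 50 colorectal cancers.
total_cancers = 50
gh_pos, gh_neg = 29, 21
msi_pos = 16                 # cancers with microsatellite instability
msi_pos_gh_neg = 4           # MSI+ cancers in which GH was absent
hmlh1_meth_gh_pos = 12       # GH+ cases with hMLH1 methylation, all MSI+

# Derived counts.
msi_pos_gh_pos = msi_pos - msi_pos_gh_neg    # MSI+ cancers that are GH+
msi_neg_gh_pos = gh_pos - msi_pos_gh_pos     # GH+ cancers without MSI

summary = {
    "msi_fraction": msi_pos / total_cancers,
    "msi_attributed_to_gh_hmlh1": hmlh1_meth_gh_pos / msi_pos,
    "gh_pos_without_msi": msi_neg_gh_pos / gh_pos,
}
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The derived values agree with the statements that 32% of cancers showed
microsatellite instability, that 75% of those cases carried GH with hMLH1
methylation, and that more than half of the GH+ cases lacked microsatellite
instability.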

Based on these data, the following model has been developed
integrating CGI methylation into CRC development (Figure 6). In this model,
CGI methylation plays two distinct roles, and appears to arise through distinct
mechanisms. Initially, type A methylation arises as a function of age in
normal colorectal epithelial cells. By affecting genes that regulate the growth
and/or differentiation of these cells, such methylation results in a
hyperproliferative state, which is thought to precede tumor formation in the
colon (reviewed by Lipkin, 1988). Such hyperproliferation is known to arise
with age in colorectal epithelium (Holt et al, 1988; Roncucci et al, 1988), and
to be marked in patients with CRC. The cause of type A methylation is
unknown, but without being bound by theory it is possible that it is related to
endogenous factors inherent to the structure of DNA, and that it may be
modulated by factors such as level of ongoing expression and exposure to
carcinogenic insults. Furthermore, modulation of type A methylation may
provide one possible explanation for the reduction in CRC tumorigenesis
achieved by reducing levels of DNA-methyltransferase (Laird et al, 1995).

A second major role for CGI methylation appears later, perhaps at the
transition between small and large adenomas in the colon. This methylation
(type C) affects only a subset of tumors, which then evolve along a pathway of
global hypermethylation. This GH leads to cancer development through the
simultaneous inactivation of multiple tumor-suppressor genes such as p16, and
induction of mismatch repair deficiency through inactivation of hMLH1. The
cause of this global hypermethylation is unknown, but may well be related to
inactivation of a gene that protects CGIs from de-novo methylation. Finally,
we propose that tumors without GH evolve along more classic genetic
instability pathways, including chromosomal instability (Lengauer et al,
1997A). Interestingly in this regard, Lengauer et al found an inverse
correlation between chromosomal instability and MMR deficiency in CRC
cell lines (Lengauer et al, 1997B).

While based on CRC, this model is applicable to most human
malignancies. Indeed, evidence has also been found for type A and type C
methylation in brain tumors (Li et al, 1998). Preliminary evidence also
suggests the presence of global hypermethylation in multiple types of cancers,
including stomach cancers, brain tumors and hematopoietic malignancies.

In conclusion, a novel method, MCA, has been developed to
selectively amplify methylated CGIs. Using MCA/RDA, 33 differentially
methylated clones in CRC were isolated. The methylation profile of these
clones revealed that nearly all methylation in CRC can be accounted for by (1)
age-related methylation and (2) a hypermethylator phenotype presumably
caused by global hypermethylation. Deciphering the mechanisms underlying
these phenomena should facilitate the early detection, prevention and therapy
of cancers, including colorectal cancers.

EXAMPLE 7

IDENTIFICATION OF CACNA1G AS A TARGET FOR
HYPERMETHYLATION ON HUMAN CHROMOSOME 17q21

To identify genes differentially methylated in colorectal cancer,
methylated CpG island amplification was used followed by representational
difference analysis (Razin and Cedar, Cell 77: 473-476, 1994, herein
incorporated by reference). One of the clones recovered (MINT31, see above)
mapped to human chromosome 17q21 using a radiation hybrid panel, and a
Blast search revealed this fragment to be completely identical to part of a BAC
clone (Genbank: AC004590) sequenced by high throughput genomic
sequencing. The region surrounding MINT31 fulfills the criteria of a CpG
island: GC content 0.67, CpG/GpC ratio 0.78 and a total of 305 CpG sites in a
4 kb region. Using this CpG island and 10 kb of flanking sequences in a Blast
analysis, several regions highly homologous to the rat T-type calcium channel
gene, CACNA1G, were identified (Perez-Reyes et al., Nature 391: 896-900,
1998, herein incorporated by reference). Several ESTs were also identified in
this region. Using Genscan, 2 putative coding sequences (G1 and G2) were
identified. Blastp analysis revealed that G1 has a high homology to the
EH-domain-binding protein, epsin, while G2 is homologous to a C. elegans
hypothetical protein (accession No. 2496828).
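
The three CpG island measures quoted above (GC content, CpG/GpC ratio and
number of CpG sites) can be computed from a sequence as sketched below. This
is an illustrative sketch only: the function name cpg_island_metrics is
hypothetical, the CpG/GpC ratio is taken literally as the count of CG
dinucleotides divided by the count of GC dinucleotides, and the short test
sequence is invented for demonstration and is not the MINT31 region.

```python
def cpg_island_metrics(seq: str) -> dict:
    """Return GC content, CpG count, GpC count and CpG/GpC ratio of `seq`."""
    s = seq.upper()
    n = len(s)
    gc_content = (s.count("G") + s.count("C")) / n if n else 0.0
    # Neither "CG" nor "GC" can overlap itself, so str.count is exact here.
    cpg = s.count("CG")
    gpc = s.count("GC")
    ratio = cpg / gpc if gpc else float("nan")
    return {
        "gc_content": gc_content,
        "cpg_sites": cpg,
        "gpc_sites": gpc,
        "cpg_gpc_ratio": ratio,
    }


# Toy sequence, for demonstration only.
print(cpg_island_metrics("ACGCGT"))
```

Applied to the 4 kb region described in the text, a function of this kind
would be expected to reproduce the reported values if the same definitions
were used, which the text does not spell out.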

The MINT31 CpG island corresponds to the 3' regions of G1 and G2,
based on the direction of the open reading frame and the presence of a poly A
tail, and is unlikely to influence their transcription. The EST closest to
MINT31 (H13333) was sequenced entirely and was found not to contain a
continuous open reading frame, but a poly-adenylation signal was identified
on one end, along with a poly A tail. These data suggest that H13333
corresponds to the last 2 exons of an unidentified gene. MINT31 is in the
intron of this gene and is unlikely to influence its transcription. However,
based on both promoter prediction (TSSG) analysis of this region and
homology to the rat CACNA1G sequence, the MINT31 CpG island is also in
the 5' region of the human CACNA1G gene and may play a role in its
transcriptional activity.

The human CACNA1G sequence deposited in Genbank lacks the 5'
region of the gene, when compared to the rat homologue. To determine the 5'
region of human CACNA1G, we amplified cDNA by RT-PCR using primers
based on the BAC sequence (Genbank: AC004590, herein incorporated by
reference). The PCR products were cloned and sequenced, and the genomic
organization of the gene was determined by comparing the newly identified
sequences as well as the known sequences to the BAC that covers this region.
CACNA1G is composed of 34 exons which span a 70 kb area. Based on
sequences deposited in Genbank, the gene has two possible 3' ends caused by
alternate splicing. Human CACNA1G is highly homologous to rat CACNA1G,
with 93% identity at the protein level and 89% identity at the nucleotide
level. The 5' flanking region of CACNA1G lacks TATA and CAAT boxes, which is
similar to many housekeeping genes. A putative TFIID binding site was
identified 547-556 bp upstream from the translation start site, and several
other potential transcription factor binding sites, such as AP1 (1 site), AP2 (2
sites) and SP1 (10 sites), were identified upstream of CACNA1G exon 1 using
the promoter prediction program, TESS (data not shown).

The CACNA1G CpG island is 4 kb, and is larger than many typical
CpG islands. MINT31 corresponds to the 5' edge of the island while
CACNA1G is in the 3' region. It is not known whether large CpG islands such
as this are coordinately regulated with regard to protection from methylation
and aberrant methylation in cancer. To address this issue, the methylation
status of the 5' region of CACNA1G was studied using bisulfite-PCR of DNA
from normal tissues as well as 35 human cancer cell lines from colon, lung,
prostate, breast and hematopoietic tumors. The CpG island was divided into 8
regions and their methylation status was examined separately. The genomic
DNA was treated with sodium bisulfite and PCR amplified using primers
containing no or a minimum number of CpG sites. Methylated alleles were
detected by digesting the PCR products using restriction enzymes which
specifically cleave sites created or retained due to the presence of methylated
CpGs. None of the regions was methylated in normal colon, consistent with a
uniform protection against de-novo methylation.
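
The principle of detecting methylated alleles through restriction sites that
are "created or retained" after bisulfite treatment can be illustrated in
silico. The following is a simplified sketch under stated assumptions: only
the top strand is modeled, conversion is assumed to be complete, the TaqI
recognition sequence (TCGA) is used purely as an example enzyme (the text does
not name the enzymes used for each region), and both template sequences are
invented for demonstration.

```python
def bisulfite_convert(seq: str, methylated: set[int]) -> str:
    """Model top-strand bisulfite conversion followed by PCR.

    Unmethylated cytosines read as T; cytosines whose 0-based index is in
    `methylated` are protected and remain C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


TAQI_SITE = "TCGA"  # example recognition sequence, chosen for illustration

# Site retained: TCGA exists before conversion; the CpG cytosine is index 5.
retained_template = "GGAATCGATTGG"
# Site created: CCGA before conversion; the CpG cytosine is index 4.
created_template = "GGACCGATGG"

for name, template, cpg_index in [
    ("retained", retained_template, 5),
    ("created", created_template, 4),
]:
    methylated_product = bisulfite_convert(template, {cpg_index})
    unmethylated_product = bisulfite_convert(template, set())
    print(name,
          "methylated cut:", TAQI_SITE in methylated_product,
          "unmethylated cut:", TAQI_SITE in unmethylated_product)
```

In both toy cases the site is present only in the product derived from the
methylated template, which is the property the digestion step relies on.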

Regions 1 and 2 were frequently methylated in cancer cell lines, and
behaved in a concordant manner. These 2 regions were methylated in most
cancer cell types except gliomas, and in most cell lines where methylation was
found, both regions were methylated simultaneously. Region 3, which is less GC
rich than any of the other regions, had either no methylation or very low levels
of methylation in most cell lines. Regions 5, 6, and 7 behaved quite
differently compared to regions 1-3. Methylation of these regions was less
frequent than that of regions 1-2, as 22/35 cell lines had no detectable
methylation there, despite often showing methylation of regions 1-2. However,
when methylation was present (in 13/35 cell lines), it affected all 3 regions
simultaneously, although to varying extents. Finally, regions 4 and 8 behaved
differently again, being partially methylated primarily in colon and breast
cell lines. Therefore, with regard to hypermethylation in cancer, the CpG rich
region upstream of CACNA1G appears to be composed of 2 CpG islands which
behave independently. MINT31 corresponds to the upstream CpG island
(island 1, regions 1 and 2), while the 5' region of CACNA1G is contained in
the downstream CpG island (island 2, regions 5-7). Regions 3, 4 and 8
correspond to the edges of these CpG islands, and behave a little differently
from the hearts of the CpG islands, as previously described for the E-Cad gene
(Graff et al., J. Biol. Chem. 272: 22322-22329, 1997).

Overall, the methylation patterns fell into 5 distinct categories: (1) No
methylation in any region (normal tissue). (2) Slight methylation of island 1
(6 cell lines, see for example TSU-PR1 in Fig. 2). (3) Heavy methylation of
island 1 but no methylation of island 2 (16 cell lines, see for example Caco2 in
Fig. 2). (4) Heavy methylation of island 1 and moderate to heavy methylation
of island 2 (6 cell lines, see for example RKO and Raji in Fig. 2). (5) High
methylation of island 1 and low to moderate methylation of island 2 (7 cell
lines, see for example MB-231 in Fig. 2).
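
The cell line counts given for patterns 2 to 5 can be reconciled with the
region-level counts reported for the 35 cell lines. The tally below is a
small sketch; it assumes that pattern 2 (described only as slight methylation
of island 1) involves no methylation of island 2, which the text implies but
does not state outright.

```python
# Number of cell lines reported for each methylation pattern.
pattern_counts = {2: 6, 3: 16, 4: 6, 5: 7}

total_cell_lines = sum(pattern_counts.values())
# Patterns 4 and 5 include methylation of island 2 (regions 5-7).
island2_methylated = pattern_counts[4] + pattern_counts[5]
# Patterns 2 and 3 are taken to have no methylation of island 2 (assumption).
island2_unmethylated = pattern_counts[2] + pattern_counts[3]

print(total_cell_lines, island2_methylated, island2_unmethylated)
```

Under that assumption the tally matches the 13/35 cell lines with, and 22/35
cell lines without, detectable methylation of regions 5-7.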

In a previous study of rat CACNA1G, this gene was shown to be
expressed most abundantly in the brain (Perez-Reyes et al., Nature 391:
896-900, 1998). To determine the expression of CACNA1G in normal and
neoplastic human cells, RT-PCR was performed using cDNA from various
normal tissues and from a panel of 27 tumor cell lines. CACNA1G was
expressed ubiquitously in a variety of tissues and cell lines. In normal tissues
expression was relatively low but easily detectable, while most cell lines had
relatively high expression of CACNA1G. However, some cell lines had
negligible or totally absent levels of CACNA1G expression. The results of
CACNA1G expression were correlated with the detailed methylation analysis
previously described. In this analysis, a remarkable pattern emerged.
Methylation of regions 1-4 and 8 had no effect on CACNA1G expression.
However, there was a strong inverse correlation between methylation of
regions 5-7 and expression of the gene. In fact, all cell lines that lack
methylation of this region strongly express the gene. All 6 cell lines with
pattern 4 methylation studied had no detectable expression. Finally, the 7
cell lines with pattern 5 methylation (examples DLD-1 and MB-453) had
variable levels of expression ranging from very low to near normal. The fact
that patterns 3 and 5 differ significantly with regard to expression, but are
almost identical with regard to methylation of all regions except 7, suggests
that this area is important in the inactivation of CACNA1G.

To confirm whether methylation of the 5' CpG island of CACNA1G is
really associated with gene inactivation, 3 non-expressing cell lines showing
pattern 4 methylation (RKO, SW48 and Raji) and 2 weakly expressing cell
lines showing pattern 5 methylation (MB-231 and MB-435) were treated with
1 µM of the methyl-transferase inhibitor 5-deoxy-azacitidine. After treatment,
all these cell lines re-expressed CACNA1G mRNA. Consistent with
re-expression, demethylation of region 7 was observed after 5-deoxy-azacitidine
treatment (Fig. 3C).

De novo cytosine methylation is thought to sometimes occur in vitro
during cell propagation (Antequera et al., Cell 62: 503-514, 1990). To
determine whether the methylation of CACNA1G occurs in vivo, primary
human tumors were examined for methylation of the 5' region of CACNA1G.
Aberrant methylation was detected in 17 out of 49 (35%) colorectal cancers, 4
out of 28 colorectal adenomas (25%), 4 out of 16 (25%) gastric cancers and 3
out of 17 (18%) acute myelogenous leukemia cases. In colorectal cancers,
there was a significant correlation between methylation of CACNA1G and
methylation of p16 (p<0.005) and hMLH1 (p<0.001), as well as a strong
correlation with the presence of microsatellite instability and the recently
identified CpG island methylator phenotype (CIMP), supporting the view that
CACNA1G is also a target for CIMP in colorectal cancer.

To determine whether aberrant methylation of the 5' region of
CACNA1G affects the expression status of this gene in primary tumors, we
performed RT-PCR using cDNA from a series of colorectal adenomas. Six
out of 8 cases which showed no methylation of region 7 expressed CACNA1G.
In sharp contrast, all 5 cases that showed methylation of region 7 had no
detectable expression of this gene.

Thus, a human T-type calcium channel gene (CACNA1G) has been
identified and cloned using the MINT31 sequence as a probe. The human
T-type calcium channel gene has been determined to be a target of aberrant
methylation and silencing in human tumors. The data show that MINT31 (a
representative sequence of MINT1-33) can be used as a probe to identify
genes that play a role in disorders such as cell proliferative disorders.

Detailed analysis of the CpG island upstream of CACNA1G revealed
that methylation 300 to 800 bp upstream of the gene closely correlated with
transcriptional inactivation. The CACNA1G promoter is contained in a large
GC rich area that is not coordinately methylated in cancer. The CpG island
around MINT31 is much more frequently methylated in cancers compared to
that just upstream of CACNA1G. This may simply be caused by differential
susceptibility to de-novo methylation between these two regions, with
methylation of MINT31 serving as a trigger, and eventually spreading to
CACNA1G, as described in other genes (Graff et al., J. Biol. Chem. 272:
22322-22329, 1997). However, it is likely that these 2 regions are controlled
by different mechanisms because (1) cell lines kept in culture for many
generations do not in fact spread methylation from MINT31 to CACNA1G
(e.g., Caco2), (2) region 3 that separates the 2 islands is infrequently and
sparsely methylated in cancer and (3) 2 cases of primary colorectal cancer
were found which are methylated at the CACNA1G promoter but not at
MINT31. Therefore, methylation of MINT31 appears to be independent of
methylation of CACNA1G, suggesting that they are 2 distinct CpG islands
regulated by different mechanisms. These data leave open the possibility that
MINT31 is the promoter for an unidentified gene, which may perhaps be
transcribed opposite to CACNA1G.

Many CpG islands of silenced genes appear to be methylated
uniformly and heavily throughout the island (e.g., Graff et al., J. Biol. Chem.
272: 22322-22329, 1997). In contrast, the methylation pattern of the 5' region
of CACNA1G (regions 5-7) was heterogeneous in the cell lines which did not
express this gene. Nevertheless, methylation does appear to play a role in
CACNA1G repression since demethylation readily reactivates the gene.

The causes of CACNA1G methylation remain to be determined.
Methylation was not detected in normal colon mucosa, placenta, normal breast
epithelium and normal bone marrow, including samples from aged patients,
suggesting that methylation of this region is cancer specific. However, there
was a significant correlation between methylation of CACNA1G and other
tumor suppressor genes such as p16 and hMLH1. Thus, CACNA1G probably
is a target for the recently described CIMP phenotype, which results in a form
of epigenetic instability with simultaneous inactivation of multiple genes. It
should be noted that a gene identified by the method of the invention
(MINT31) has been successfully utilized to identify another gene of interest
(CACNA1G) whose methylation pattern correlates with the presence of
specific cell proliferative disorders.

T-type calcium channels are involved not only in electrophysiological
rhythm generation but also in the control of cytosolic calcium during cell
proliferation and cell death (reviewed in Berridge et al., Nature 395: 645-648,
1998). The results demonstrate that the expression of CACNA1G is not
limited to brain and heart, suggesting that it may play a role in other
tissues. It has previously been shown that Ca2+ influx via T-type channels is
an important factor during the initial stages of cell death such as apoptosis
(Berridge et al., Nature 395: 645-648, 1998), ischemia (Fern, J. Neurosci. 18:
7232-7243, 1998) and complement-induced cytotoxicity (Newsholme et al.,
Biochem. J. 295: 773-779, 1993). These studies determining the methylation
status of CACNA1G suggest that the impairment of voltage gated calcium
channels may play an important role in cancer development and progression
through altering calcium signaling.

EXAMPLE 8

EXPERIMENTAL PROCEDURES

Methylated CpG Island Amplification. 

The procedure is outlined in Figure 1. Five µg of DNA were digested
with 100 units of SmaI for 6 hours (all restriction enzymes were from NEB).
The DNA was then digested with 20 units of XmaI for 16 hours. DNA
fragments were then precipitated with ethanol. RXMA and RMCA PCR
adaptors were prepared by incubation of the oligonucleotides RXMA24
(5'-AGCACTCTCCAGCCTCTCACCGAC-3') (SEQ ID NO:34) and
RXMA12 (5'-CCGGGTCGGTGA-3') (SEQ ID NO:35), or RMCA24
(5'-CCACCGCCATCCGAGCCTTTCTGC-3') (SEQ ID NO:36) and
RMCA12 (5'-CCGGGCAGAAAG-3') (SEQ ID NO:37) at 65°C for two min
followed by cooling to room temperature. 0.5 µg of DNA was ligated to
0.5 nmol of RXMA or RMCA adaptor using T4 DNA ligase (NEB). PCR was
performed using 3 µl of each of the ligation mix as a template in a 100 µl
volume containing 100 pmol of RXMA24 or RMCA24 primer, 5 units of Taq
DNA polymerase (GIBCO-BRL), 4 mM MgCl2, 16 mM (NH4)2SO4,
10 mg/ml of BSA, and 5% v/v DMSO. The reaction mixture was incubated at
72°C for 5 min and at 95°C for 3 min. Samples were then subjected to 25
cycles of amplification consisting of 1 min at 95°C, and 3 min at either 72°C
or 77°C in a thermal cycler (Hybaid, Inc.). The final extension time was 10
min.
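
The selection logic of the sequential SmaI and XmaI digestion can be
sketched in silico. The sketch below makes several simplifying assumptions
that are not taken from the text: digestion is complete, only the top strand
is tracked, fragment length does not affect amplification, and the function
name amplifiable_fragments and the test sequence are hypothetical. SmaI
cleaves unmethylated CCCGGG to leave blunt ends, whereas XmaI cleaves the
remaining (methylated) sites as C^CCGGG to leave CCGG overhangs that can
accept the adaptors; a fragment is therefore taken to be amplifiable only
when both of its ends derive from methylated sites.

```python
import re

SITE = "CCCGGG"  # recognition sequence shared by SmaI and XmaI


def amplifiable_fragments(seq: str, methylated_sites: set[int]) -> list[tuple[int, int]]:
    """Return (start, end) slices of fragments flanked by two methylated sites.

    `methylated_sites` holds the 0-based start index of each methylated
    CCCGGG site. Unmethylated sites are cut blunt by SmaI and cannot be
    ligated to the CCGG-overhang adaptors, so fragments touching them
    are excluded.
    """
    seq = seq.upper()
    sites = [m.start() for m in re.finditer(SITE, seq)]
    fragments = []
    for left, right in zip(sites, sites[1:]):
        if left in methylated_sites and right in methylated_sites:
            # XmaI cuts after the first C of each site on the top strand.
            fragments.append((left + 1, right + 1))
    return fragments


# Toy sequence with CCCGGG sites starting at indices 3, 13 and 23.
toy = "AAACCCGGGTTTTCCCGGGAAAACCCGGGTT"
for start, end in amplifiable_fragments(toy, {3, 13}):
    print(start, end, toy[start:end])
```

In the toy example only the fragment between the two methylated sites is
returned; marking the middle site as unmethylated removes every fragment,
which mirrors the requirement for two closely spaced methylated SmaI sites.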

Detection of Aberrant Methylation Using MCA.

MCA products from normal colon mucosa and corresponding cancer
tissues were prepared as described above. One µg of MCA products was
resuspended in 4 µl of TE (10 mM Tris pH 8.0, 1 mM EDTA pH 8.0), mixed
with 2 µl of 20X SSC, and a 1 µl aliquot of this mix was blotted onto nylon
membranes (Nunc) using a 96 well replication system (Nunc). The
membranes were baked at 80°C, UV crosslinked for 2 min and hybridized
using 32P labeled probes. Each sample was blotted in duplicate. Each filter
included mixtures of a positive control (Caco2) and a negative control (normal
colon mucosa from an 18 year old individual). The filters were exposed to a
phosphor screen for 24 to 72 hours and developed using a phosphorimager
(Molecular Dynamics). The intensity of each signal was calculated using the
Image Quant software, and methylation levels were determined relative to the
control samples.
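
One way in which a dot blot signal might be expressed relative to the
control samples is sketched below. The text does not give the formula that
was used, so the linear scaling between the negative and positive controls,
the averaging of duplicate spots, and the function names are all assumptions
made for illustration; the intensity values in the example are invented.

```python
def mean_of_duplicates(first: float, second: float) -> float:
    """Average the two spots blotted for one sample."""
    return (first + second) / 2.0


def relative_methylation(sample: float, negative: float, positive: float) -> float:
    """Scale a signal so the negative control maps to 0.0 and the positive to 1.0.

    Linear scaling is an assumption; values are clamped to the range 0 to 1.
    """
    span = positive - negative
    if span <= 0:
        raise ValueError("positive control must exceed negative control")
    fraction = (sample - negative) / span
    return min(1.0, max(0.0, fraction))


# Invented intensities, for demonstration only.
sample_signal = mean_of_duplicates(500.0, 600.0)
print(relative_methylation(sample_signal, negative=100.0, positive=1000.0))
```

If the control mixtures on each filter were used to build a standard curve
instead, the linear step here would be replaced by interpolation along that
curve.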

RDA.

RDA was performed essentially as previously reported (Lisitsyn et al,
1993) with the following modifications. For the first and second rounds of
competitive hybridization, 500 ng and 100 ng of ligation mix were used,
respectively. To eliminate the digested adaptor, a cDNA spun column
(Amersham) was used instead of excising from the agarose gel. Primers used
for the first and second rounds of RDA are as follows:

JXMA24    5'-ACCGACGTCGACTATCCATGAACC-3'    SEQ ID NO:38
JXMA12    5'-CCGGGGTTCATG-3'                SEQ ID NO:39
JMCA24    5'-GTGAGGGTCGGATCTGGCTGGCTC-3'    SEQ ID NO:40
JMCA12    5'-CCGGGAGCCAGC-3'                SEQ ID NO:41
NXMA24    5'-AGGCAACTGTGCTATCCGAGTGAC-3'    SEQ ID NO:42
NXMA12    5'-CCGGGTCACTCG-3'                SEQ ID NO:43
NMCA24    5'-GTTAGCGGACACAGGGCGGGTCAC-3'    SEQ ID NO:44
NMCA12    5'-CCGGGTGACCCG-3'                SEQ ID NO:45

After the second round of competitive hybridization, PCR products
were digested with XmaI. The J adaptor was eliminated by column filtration.
The PCR products were then subcloned into Bluescript SK(-) (Stratagene). To
screen for inserts, a total of 396 clones were cultured overnight in LB medium
with ampicillin and 3 µl of the culture was directly used as template for a PCR
reaction. Each clone was amplified with
T3 (5'-AATTAACCCTCACTAAAGGG-3') (SEQ ID NO:46)
and
T7 (5'-GTAATACGACTCACTATAGGGC-3') (SEQ ID NO:47)
primers,
blotted onto nylon membranes, and screened for cross hybridization with 32P
labeled inserts. The clones differentially hybridizing to tester and driver MCA
products were further characterized by Southern blot analysis and DNA
sequencing.
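
The 24-mer and 12-mer oligonucleotides listed for the R, J and N adaptors
appear to be designed as partially double-stranded pairs: each 12-mer begins
with CCGG, matching the overhang left by XmaI, and its remaining 8 bases
should anneal to the 3' end of the corresponding 24-mer. The sketch below
checks that relationship for the six pairs given in this document. The
function names and the interpretation of the adaptor design are assumptions
for illustration; the sequences are copied from the text.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def adaptor_is_consistent(long_oligo: str, short_oligo: str, overhang: str = "CCGG") -> bool:
    """Check that the short oligo starts with the overhang and that the rest
    of it is complementary to the 3' end of the long oligo."""
    if not short_oligo.startswith(overhang):
        return False
    annealing_part = short_oligo[len(overhang):]
    return long_oligo.endswith(reverse_complement(annealing_part))


# (24-mer, 12-mer) pairs as listed in the text.
ADAPTORS = {
    "RXMA": ("AGCACTCTCCAGCCTCTCACCGAC", "CCGGGTCGGTGA"),
    "RMCA": ("CCACCGCCATCCGAGCCTTTCTGC", "CCGGGCAGAAAG"),
    "JXMA": ("ACCGACGTCGACTATCCATGAACC", "CCGGGGTTCATG"),
    "JMCA": ("GTGAGGGTCGGATCTGGCTGGCTC", "CCGGGAGCCAGC"),
    "NXMA": ("AGGCAACTGTGCTATCCGAGTGAC", "CCGGGTCACTCG"),
    "NMCA": ("GTTAGCGGACACAGGGCGGGTCAC", "CCGGGTGACCCG"),
}

for name, (long_oligo, short_oligo) in ADAPTORS.items():
    print(name, adaptor_is_consistent(long_oligo, short_oligo))
```

A check of this kind is also a convenient guard against transcription
errors when the oligonucleotide sequences are copied for ordering.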

Southern blot analysis.

Five µg of DNA was digested with 20-100 units of restriction enzymes
as specified by the manufacturer (NEB). DNA fragments were separated by
agarose gel electrophoresis and transferred to a nylon membrane (Zeta-probe,
Bio-Rad). Filters were hybridized with 32P-labeled probes and washed at
65°C with 2X SSC, 0.1% SDS for 10 min twice, and 0.1X SSC, 0.1% SDS
for 20 min. Filters were then exposed to a phosphor screen for 24-72 hours
and analyzed using a phosphorimager (Molecular Dynamics).

DNA sequencing and analysis.

Plasmid DNA was prepared using the Wizard Plus Minipreps
(Promega) according to the supplier's recommendation. Sequence analysis was
carried out at the Johns Hopkins Core Sequencing Facility using automated
DNA sequencers (Applied Biosystems). Sequence homologies were identified
using the BLAST program of the National Center for Biotechnology
Information (NCBI) available at http://www.ncbi.nlm.nih.gov/BLAST using
the default parameters of the web site. Putative promoter sequences were
predicted using the computer programs NNPP and TSSG available through the
Baylor College of Medicine launcher at http://dotimgen.bcm.tmc.edu:9331.

Bisulfite-restriction methylation analysis.

DNA from colon tumors, cell lines and normal colon mucosa was
treated with bisulfite as reported previously (Herman et al, 1996). Primers
used for PCR were as follows:

hMLH1,    5'-TAGTAGTYGTTTTAGGGAGGGA-3' (SEQ ID NO:44),
          5'-TCTAAATACTCAACRAAAATACCTT-3' (SEQ ID NO:45);
MINT1,    5'-GGGTTGGAGAGTAGGGGAGTT-3' (SEQ ID NO:46),
          5'-CCATCTAAAATTACCTCRATAACTTA-3' (SEQ ID NO:47);
MINT2,    5'-YGTTATGATTTTGTTTAGTTAAT-3' (SEQ ID NO:48),
          5'-TACACCAACTACCCAACTACCTC-3' (SEQ ID NO:49);
Versican, 5'-TTATTAYGTTTTTTATGTGATT-3' (V1) (SEQ ID NO:50),
          5'-ACCTTCTACCAATTACTTCTTT-3' (V2) (SEQ ID NO:51).

Ten to 20 µl of the amplified products were digested with restriction
enzymes which distinguish methylated from unmethylated sequences as
reported previously (Sadri et al, 1996; Xiong et al, 1997), electrophoresed on
3% agarose or 5% acrylamide gels, and visualized by ethidium bromide
staining.
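
Several of the bisulfite-PCR primers listed above contain the degenerate
bases Y (C or T) and R (A or G), presumably placed at unavoidable CpG
positions so that both methylated and unmethylated templates are amplified.
The sketch below expands such a primer into the individual sequences it
represents. The function name expand_degenerate is hypothetical, and only the
IUPAC codes that occur in these primers are included.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed for the primers in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(bases) for bases in product(*pools)]


# hMLH1 forward and reverse primers as listed in the text.
for primer in ("TAGTAGTYGTTTTAGGGAGGGA", "TCTAAATACTCAACRAAAATACCTT"):
    print(primer, "->", expand_degenerate(primer))
```

Each primer with a single degenerate position expands to two sequences, one
matching the converted (unmethylated) template and one matching the
unconverted (methylated) template at that CpG.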



RT-PCR

Total RNA was prepared from normal colon epithelium and tumor cell
lines using TRIZOL (GIBCO-BRL). To study gene expression following
demethylation, cell lines were treated with 1 µM of 5-aza-2'-deoxycytidine for
2-5 days. cDNA was prepared using random hexamers and reverse
transcriptase as specified by the manufacturer (Boehringer). The expression
of versican was determined by RT-PCR using the primers
VF 5'-GCTGCCTATGAAGATGGATTTGAGC-3' (SEQ ID NO:52) and
VR 5'-GGAGTTCCCCCACTGTTGCCA-3' (SEQ ID NO:53).

The PCR products were visualized by ethidium bromide staining. The
cDNA samples were also amplified using the GAPDH gene primers
GAPF 5'-CGGAGTCAACGGATTGGTCGTAT-3' (SEQ ID NO:54) and
GAPR 5'-AGCCTTCTCCATGGTGGTGAAGAC-3' (SEQ ID NO:55)
as a control for RNA integrity. All reactions were performed using RT (-)
controls where the reverse transcriptase enzyme was omitted.

Chromosomal mapping.

The chromosomal location of clones that did not correspond to known
genes was determined using a human-rodent somatic cell hybrid panel and a
radiation hybrid panel (Research Genetics). PCR reactions were performed
using 30 ng of each of the hybrid panel DNA as a template in a 40 µl volume
containing 15 pmol of each primer, 0.5 units of Taq DNA polymerase
(GIBCO-BRL), 2 mM MgCl2, BSA and 5% DMSO. First denaturation was
carried out at 95°C for 3 min. Samples were then subjected to 35 cycles of
amplification consisting of 25 sec at 94°C, 1 min at 60 to 68°C and 1.5 min
at 72°C in a thermal cycler (Hybaid). The final extension time was 10 min.
Ten µl of the PCR product were electrophoresed in a 2% agarose gel and the
genotype of each panel was determined. Linkage analysis was performed
using the RH server of Stanford University as described (Stewart et al, 1997).
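
For planning purposes, the programmed hold time of the two thermal cycling
protocols described in this document (the MCA amplification and the hybrid
panel mapping PCR) can be added up as sketched below. The function name
program_seconds is hypothetical, and the totals exclude temperature ramping
between steps, which depends on the instrument and is not given in the text.

```python
def program_seconds(initial_steps: list[int], cycle_steps: list[int],
                    n_cycles: int, final_extension: int) -> int:
    """Total programmed hold time in seconds, excluding ramp times."""
    return sum(initial_steps) + n_cycles * sum(cycle_steps) + final_extension


# MCA PCR: 72°C for 5 min, 95°C for 3 min, then 25 cycles of
# 1 min at 95°C and 3 min at 72°C or 77°C, with a 10 min final extension.
mca_seconds = program_seconds([5 * 60, 3 * 60], [60, 3 * 60], 25, 10 * 60)

# Mapping PCR: 95°C for 3 min, then 35 cycles of 25 sec at 94°C,
# 1 min at 60 to 68°C and 1.5 min at 72°C, with a 10 min final extension.
mapping_seconds = program_seconds([3 * 60], [25, 60, 90], 35, 10 * 60)

print(f"MCA PCR: {mca_seconds / 60:.1f} min")
print(f"Mapping PCR: {mapping_seconds / 60:.1f} min")
```

Both programs come to a little under two hours of hold time, so actual run
times will be somewhat longer once ramping is included.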

Although the invention has been described with reference to the
presently preferred embodiment, it should be understood that various
modifications can be made without departing from the spirit of the invention.
Accordingly, the invention is limited only by the following claims.

What is claimed is:

1. A method for identifying a methylated CpG-containing nucleic acid,
comprising

a) contacting a nucleic acid sample suspected of
containing a CpG-containing nucleic acid with a
methylation sensitive restriction endonuclease, under
conditions and for a time to allow cleavage of the
nucleic acid;

b) contacting the sample with an isoschizomer of said
methylation sensitive restriction endonuclease, wherein
said isoschizomer of said methylation sensitive
restriction endonuclease cleaves both methylated and
unmethylated CpG sites;

c) adding oligonucleotides to the nucleic acid sample
under conditions and for a time to allow ligation of the
oligonucleotides to the nucleic acid cleaved by said
restriction endonuclease; and

d) amplifying said cleaved nucleic acid.

2. The method of claim 1, wherein said methylation sensitive restriction
endonuclease is SmaI.

3. The method of claim 1, wherein said amplifying is by polymerase
chain reaction amplification.

4. The method of claim 3, wherein said amplifying by polymerase chain
reaction amplification comprises annealing primers complementary to
said oligonucleotide.

5. The method of claim 1, wherein said oligonucleotide comprises a
sequence as set forth in a member of the group selected from SEQ ID
NO:34 (RXMA24) and SEQ ID NO:35 (RXMA12).

6. The method of claim 1, wherein said oligonucleotide comprises a 
sequence as set forth in a member of the group selected from SEQ ID 
NO:36 (RMCA24) and SEQ ID NO:37 (RMCA12). 

7. The method of claim 1, further comprising adhering the amplified 
nucleic acid to a membrane. 

8. The method of claim 7, further comprising hybridizing the membrane 
with a probe of interest. 

9. The method of claim 1, wherein the CpG containing nucleic acid
comprises a methylated CpG island.

10. The method of claim 9, wherein the CpG island comprises a CpG
island located in a gene selected from the group consisting of a p16, a
Rb, a VHL, a hMLH1, and a BRCA1 gene.

11. The method of claim 1, wherein said sample is selected from the group
consisting of a brain cell, a colon cell, a urogenital cell, a lung cell, a renal
cell, a hematopoietic cell, a breast cell, a thymus cell, a testis cell, an ovarian
cell, a uterine cell, an intestinal cell, serum, urine, saliva, cerebrospinal fluid,
pleural fluid, ascites fluid, sputum, and stool.

12. The method of claim 1, wherein the presence of methylated
CpG-containing nucleic acid in the sample is indicative of a cell
proliferative disorder.

13. The method of claim 12, wherein the cell proliferative disorder is 
selected from the group consisting of colon cancer, lung cancer, renal 
cancer, leukemia, breast cancer, prostate cancer, uterine cancer, 
astrocytoma, glioblastoma, and neuroblastoma. 

14. The method of claim 1, further comprising performing representation
difference analysis, wherein said representation difference analysis 
comprises hybridizing a driving nucleic acid as a driver. 

15. The method of claim 14, wherein said representation difference 
analysis uses nucleic acid isolated from a member of the group 
consisting of normal colon, normal lung, normal kidney, normal blood 
cells, normal breast, normal prostate, normal uterus, normal astrocytes, 
normal glial and normal neurons. 

16. A nucleic acid identified by the method of claim 1.

17. A vector comprising the nucleic acid of claim 16.

18. A method for detecting an age-associated disorder, associated with
methylation of CpG islands, in a nucleic acid sequence of interest in a
subject having or at risk of having said disorder, comprising:

contacting a nucleic acid sample suspected of comprising a
CpG-containing nucleic acid with a methylation sensitive restriction
endonuclease, under conditions and for a time to allow cleavage of the
nucleic acid;

contacting the sample with an isoschizomer of said methylation
sensitive restriction endonuclease, wherein said isoschizomer of said
methylation sensitive restriction endonuclease cleaves both methylated
and unmethylated CpG-sites, under conditions and for a time to allow
cleavage of methylated nucleic acid;

adding oligonucleotides to the nucleic acid sample under conditions
and for a time to allow ligation of the oligonucleotides to nucleic acid
cleaved by said restriction endonuclease;

amplifying said cleaved nucleic acid;

adhering the amplified digested nucleic acid to a membrane; and

hybridizing the membrane with a probe of interest.

19. The method of claim 18, wherein the sample is selected from the group
consisting of brain cells, colon cells, urogenital cells, lung cells, renal
cells, hematopoietic cells, breast cells, thymus cells, testis cells, ovarian
cells, uterine cells, serum, urine, saliva, cerebrospinal fluid, pleural
fluid, ascites fluid, sputum, and stool.

20. The method of claim 18, wherein the probe of interest is a nucleic acid
sequence.



21. The method of claim 18, wherein the nucleic acid sequence is selected
from the group consisting of a p16, a Rb, a VHL, a hMLH1, and a
BRCA1 nucleic acid.

22. The method of claim 21, wherein said nucleic acid sequence is a p16
nucleic acid sequence.

23. The method of claim 18, wherein the sample is a tissue sample or a
biological fluid sample.

24. The method of claim 18, wherein the probe is detectably labeled.

25. The method of claim 24, wherein the label is selected from the group
consisting of a radioisotope, a bioluminescent compound, a
chemiluminescent compound, a fluorescent compound, a metal chelate,
and an enzyme.

26. The method of claim 18, wherein said age-associated disorder is
selected from the group consisting of atherosclerosis, diabetes mellitus,
and dementia.

27. The method of claim 18, wherein said age-associated disorder is a cell
proliferative disorder.

28. The method of claim 18, wherein the nucleic acid of interest is a
member of the group consisting of SEQ ID NOs:1-33 (MINT 1-33).

29. The method of claim 27, wherein said cell proliferative disorder is
selected from the group consisting of colon cancer, lung cancer, renal
cancer, leukemia, breast cancer, prostate cancer, uterine cancer,
astrocytoma, glioblastoma, and neuroblastoma.

30. The method of claim 18, further comprising performing representation
difference analysis, wherein said representation difference analysis
comprises hybridizing a driving nucleic acid as a driver.

31. The method of claim 30, wherein said representation difference
analysis uses nucleic acid isolated from a member of the group
consisting of normal colon, normal lung, normal kidney, normal blood
cells, normal breast, normal prostate, normal uterus, normal astrocytes,
normal glial and normal neurons.

32. A method for determining the response of a cell to an agent,
comprising:

a) contacting a nucleic acid sample suspected of
comprising a CpG-containing nucleic acid from said
cell with a methylation sensitive restriction
endonuclease, under conditions and for a time to allow
cleavage of unmethylated nucleic acid;

b) contacting the sample with an isoschizomer of said
methylation sensitive restriction endonuclease, wherein
said isoschizomer of said methylation sensitive
restriction endonuclease cleaves methylated and
unmethylated CpG-sites, under conditions and for a
time to allow cleavage of methylated nucleic acid;

c) adding an oligonucleotide to the nucleic acid sample
under conditions and for a time to allow ligation of the
oligonucleotide to nucleic acid cleaved by said
restriction endonuclease;

d) amplifying said cleaved nucleic acid;

e) adhering the amplified cleaved nucleic acid to a
membrane; and

f) hybridizing the membrane with a probe of interest.

33. The method of claim 32, further comprising performing representation
difference analysis, wherein said representation difference analysis
comprises hybridizing a nucleic acid as a driver.

34. The method of claim 32, wherein the agent is selected from the group
consisting of peptide, peptidomimetic, chemical compound, and a
pharmaceutical compound.

35. The method of claim 32, wherein said agent is a chemotherapeutic
agent.

36. The method of claim 32, wherein said methylation sensitive restriction
endonuclease is SmaI.

37. The method of claim 32, wherein said amplifying is by polymerase 
chain reaction amplification. 

38. The method of claim 37, wherein said amplifying by polymerase chain 
reaction amplification comprises annealing primers complementary to 
said oligonucleotide. 

39. The method of claim 32, wherein said oligonucleotide comprises a 
sequence as set forth in a member of the group selected from SEQ ID 
NO:34 (RXMA24) and SEQ ID NO:35 (RXMA12). 

40. The method of claim 32, wherein said oligonucleotide comprises a
sequence as set forth in a member of the group selected from SEQ ID
NO:36 (RMCA24) and SEQ ID NO:37 (RMCA12).

41. The method of claim 32, wherein said cell is selected from the group
consisting of a brain cell, a colon cell, an intestinal cell, a urogenital cell, a
lung cell, a renal cell, a hematopoietic cell, a breast cell, a thymus cell, a testis
cell, an ovarian cell, a uterine cell, an exocrine cell, and an endocrine cell.

42. A kit useful for the detection of a methylated CpG-containing nucleic 
acid comprising carrier means containing one or more containers 
comprising a container containing oligonucleotides for ligation to 
nucleic acid, a second container containing a methylation sensitive 
restriction endonuclease and a third container containing an 
isoschizomer of the methylation sensitive restriction endonuclease. 

43. The kit of claim 42, wherein said oligonucleotides comprise a
sequence as set forth in a member of the group selected from SEQ ID 
NO:34 (RXMA24) and SEQ ID NO:35 (RXMA12). 

44. The kit of claim 42, wherein said oligonucleotide comprises a 
sequence as set forth in a member of the group selected from SEQ ID 
NO:36 (RMCA24) and SEQ ID NO:37 (RMCA12). 

45. The kit of claim 42, further comprising one or more containers
comprising a primer complementary to said oligonucleotide.

46. A kit useful for the detection of a methylated CpG-containing nucleic
acid comprising a carrier means containing one or more containers
comprising a membrane, wherein said membrane has a nucleic acid
sequence selected from the group consisting of SEQ ID NO:1, SEQ
ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17,
SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ
ID NO:24, SEQ ID NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID
NO:32, and SEQ ID NO:33 (MINT1, MINT2, MINT4, MINT6,
MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19,
MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32, and MINT33) immobilized on said membrane.

47. An isolated nucleic acid comprising a member selected from the group
consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:14,
SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20, SEQ
ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:27, SEQ ID
NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID NO:33 (MINT1,
MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14,
MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32, and MINT33), and degenerate
variants thereof.

48. The nucleic acid of claim 47, wherein said nucleic acid is methylated.

49. The nucleic acid of claim 48, wherein said nucleic acid is
unmethylated.

50. A substantially purified polypeptide encoded by the nucleic acid of
claim 47.

51. The nucleic acid of claim 47, wherein said nucleic acid is operatively
linked to an expression control sequence.

52. The nucleic acid of claim 51, wherein the expression control sequence
is a promoter.

53. The nucleic acid of claim 52, wherein the promoter is tissue specific.

54. An expression vector containing the nucleic acid of claim 47. 

55. The vector of claim 54, wherein the vector is a plasmid. 

56. The vector of claim 54, wherein the vector is a viral vector. 

57. The vector of claim 56, wherein the viral vector is a retroviral vector. 

58. A host cell containing the vector of claim 54. 

59. The host cell of claim 58, wherein the cell is a eukaryotic cell. 

60. The host cell of claim 58, wherein the cell is a prokaryotic cell.

61. An isolated nucleic acid sequence comprising a methylated nucleic
acid having a sequence as set forth in a member of the group consisting
of SEQ ID NOs:1-33.

62. A method of identifying a compound that affects methylation of a 
nucleic acid, comprising: 

a) incubating components comprising the compound and a 
sample comprising a nucleic acid sequence identified by 
the method of claim 1 under conditions sufficient to 
allow the components to interact; and 

b) determining the effect of the compound on expression 
of the nucleic acid sequence. 

63. The method of claim 62, wherein said sample is a cell. 

64. The method of claim 62, wherein said sample is a substantially purified 
nucleic acid. 

65. The method of claim 62, wherein the compound is selected from the
group consisting of a peptide, a peptidomimetic, a chemical compound,
and a pharmaceutical compound.
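
The selection logic recited in steps a) through d) of claim 1 can be sketched in code. This is a minimal in-silico illustration and not the claimed protocol itself. It assumes that the methylation sensitive enzyme (SmaI in claim 2) leaves blunt ends at unmethylated CCCGGG sites, that the isoschizomer then leaves ligatable overhangs at the remaining methylated sites, and that only fragments carrying an oligonucleotide at both ends are amplified. The function names, the `methylated_starts` parameter, and the toy sequence are hypothetical.

```python
import re

SITE = "CCCGGG"  # SmaI recognition sequence; assumed shared by the isoschizomer


def find_sites(seq):
    """Return 0-based start positions of every CCCGGG site in seq."""
    # A lookahead keeps the scan correct even if two matches were to overlap.
    return [m.start() for m in re.finditer("(?=" + SITE + ")", seq.upper())]


def amplifiable_fragments(seq, methylated_starts):
    """Return (left_site, right_site) pairs expected to amplify.

    Assumption: unmethylated sites are cut blunt in step a) and cannot take
    an adaptor; methylated sites are cut in step b) with an overhang that
    can. A fragment amplifies only if both flanking sites are methylated.
    """
    methylated = set(methylated_starts)
    sites = find_sites(seq)
    fragments = []
    # After both digests every site is cut, so fragments span adjacent sites.
    for left, right in zip(sites, sites[1:]):
        if left in methylated and right in methylated:
            fragments.append((left, right))
    return fragments


# Hypothetical toy sequence with three sites, at positions 3, 13 and 21.
toy = "AAA" + SITE + "TTTT" + SITE + "AC" + SITE + "GG"
print(find_sites(toy))
print(amplifiable_fragments(toy, methylated_starts=[3, 13]))
```

Under these assumptions a single unmethylated site between two methylated ones is enough to remove both neighbouring fragments from the amplified pool, which is why the approach enriches for densely methylated regions.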
Figure 1
Figure 2
Figure 3
FIGURE 6A

MINT1 (SEQ ID NO: 1)

CCCGGGCTGG GTACCTGQAC CTATACCTTC ATAGCTGCCT TAGGCTCAAC TTTTCGGCGG 
GGATCCCTCT GCAGACGTGC AGGTGGCGGG AGAGCAGAGG TAGCCGCAGT AAGTGCTQAG 
AGAGCCTGAA AGAAACACCA TGAATTTTCA AACTCTCCCA CATACATTCC CGAAGCGCCT 
GTCTGGCGTC TAAGAGAGAG CAAGAGAGGG CTGGAGAGCA GGGGAGCCCG CGGGGCTGAG 
GCTCTTTGTC AGCGCCTGCA CTTCCTACGT TACAACGCCT TCATTCAGCA AAAACCTTTT 
GGGCGCCTGC TGTGCGCCAG GCCAGGCGAA GNAGACCGAG GNTGTGAAGC TCAGAGGGGA 
GAGGGACCAA TCGCAGTAAA TAAGCTACCG AGGTAATCTT AGATGGNGAT GAGGGCAGGA 
AAAGNCATCA GNCGACCTCT GACCTTTCTC TTAGGGGGTT TTCCCCTTCC GCCTGGGTTC 
TAGAACTGGG AAGANTTTTC TCCAGAGCGT CGCGGGGAGC GCCCCGGG 

MINT2 (SEQ ID NO: 2) 

CCCGGGCGCT GCCAAATGTA AACAATCCGC CATGATTTCT TTGTTTAGCT AATCGAACCT 
GCCGCCGTCT CGAGTCCCAG GCGCTCCCTC TCCCTTCTCT CCCCTTCCCC CCTCGCGCTC 
TCTTTTATTT ATTTATTTGT CTCTCCCCCC ACCCCGCCCT TAGTCTTTCC CTCTCTTTAG 
TCTTAGTAGC TGCTTTTAAT GGAAGTCGAG GCAGTTGGGT AGTTGGTGCA GGGAGTGCGG 
TGGTGGTTTT TATAAATACA GGTAAAATAA TACCCAAATT TCAGGCTGAG GTGCAGTTTC 
TTGGAAGAGG AGGAGGGTGT TCCTCTCCTT CCCTCCCTCT TTCCCCCTCC CCCGTTTATT 
AGAGTATCCT CGGTGGAAAG TGCCAGAAAA ATGTGCTGCA TCTCTGAGTC ACCTTTTCTT 
CCCGCCGAAC TCTAGCACCC AAGTTCGCTG GCGGATTTTG GACTCTGCCA AAGTGCTGAG 
TTCGTCGTTT ACTTTGAAAG CCTGAAATAT AACTGATGTC CAACTGCAQA AAGGCGCACG 
GAATCGCCGC CACCATCCCG GG 

MINT3 (SEQ ID NO: 3) 

CCCGGGTTTC CAGCTTCTCC CTTCCACCTT TGTCTCCCCT CCCTCCTACA AACTTTCAGC 
CTTCAGTCTG TTGGGGGNTA AGATCTGGGA AATAAGCGTG TGTGTCCGAG TGCCTTAGGG 
TCTGCTGTGA GCCTGAATGC GAGTCCGGTT GGTTGTGCAG TTATGAACCT GTGTGTACAT 
CTGTGTGCCT AAGACGCTGG GCAACTGTAC CCACATGACC GATGTGTGTG AACGACTGTG 
TGCCTGTGTG TCTGGATCTG CGCGTGGGTG TAGTTCGTGT GGCTGTGTAA ACCGCATGCA 
TAGGTATACA TGTACCTATG TCTCAGGCTG TGTGCGTCCT ACTGCGATGG TACGAGAGTG 
TGGGTGTGAT GGTGTATGTG ATCCTGTGTC TGCGTGTCCG TACATTTGAG TGGGTGTTGT 
GCGTGTGACT GTGTAAACTG CGAACATGTA CGTGTGCCCG CCCGTAGGTA TTACCGTGTA 
CGTGTGTCTC CGTGTGCCTG TGAGGGGTGG GGTCTGCGCG GGGATTCCCG ACCCCCCCAC 
ACTCACACCC TCCAAGCCCC GGG 



MINT4 (SEQ ID NO: 4)

CCCGGGCCTC TGGCCCTCTG CGTCTGCTAG NCTCTTTCCC CCAAGACTCC CCGAGGTGGG 
GAGAGNACTG GTGNTCCCTG GAGAAATCAA GGTGTCCAAC ATTCTCTCCG AGGCGAGGCT 
GCTTGAGCGC CAGCAACAGG NCCTGCTGAA CTTTCTTCCC CGGCTCCTAC GCTCCGGTTG 
CTCTCCATCC TCATTTCTGG GGTCAAATGG NAAAGAGGGA ATACTCCTCG ACCCCTCTCC 
CCCTTGACTA TCCAAAGCAG CCCGAAGTTG GCGAGGAGAC TCTGCCGGGT GTNCGGGCAA 
ATGNCCCGCC GGGTGGCTCC AGA2^TGGNC TGTGANCTGC ACTCGCCTCG GAGAAATTCC 
AACTCTTGGT TGAAGACTCT GACTCAGAGG AGCCCTCTGA GGATGCGCCC CTGGAGAAAG 
NGCACGGGAG GGAAAGTGGA GAGAACTCGN CCTCCCCAGG GGCTAGNCAG CTACTCCCGG 
G 
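
The sequences in this figure are printed in blocks of ten bases, and the clones listed above begin and end with CCCGGG, which is what one would expect of fragments recovered between two SmaI sites. A minimal sketch of how such a listing could be joined and checked follows. The helper names and the toy input are hypothetical and are not part of the original disclosure.

```python
def parse_blocked_sequence(text):
    """Join a sequence printed in whitespace-separated blocks into one
    uppercase string (line breaks and spaces are layout only)."""
    return "".join(text.split()).upper()


def flanked_by_site(seq, site="CCCGGG"):
    """True if the sequence both starts and ends with the given site."""
    return seq.startswith(site) and seq.endswith(site)


# Hypothetical toy listing laid out like the figure (ten-base blocks).
listing = """
CCCGGGACGT ACGTACGTAC
GTACCCCGGG
"""
sequence = parse_blocked_sequence(listing)
print(len(sequence), flanked_by_site(sequence))
```

A clone that fails this check is not necessarily wrong; a failure may simply point to a transcription problem at one end of the printed sequence.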
MINT5 (SEQ ID NO: 5)

CCCGGGTAAG TGAGCCCTGC TGCACTCCCG CACCCCTCTT CCCCATGCCC
ACCCTCCGGG GATGCAACAC CCTGTTCCCA TGGAACACGG GGGTTGGCAG
TCACACTGTC CCCACCCAGC TTCAGGCTTG GTCTCCTCTA GGTTTGCCTT
CTGAGGAAGC AGTCCCAGGG CATTTACTGA CCAANCAGAA AACAGGGGTT
GGGAAAAGTG AGTAGGTGGG TCTGCAACCG TTACAATCAC ATCACTTTAT
TCTTAATTCG AGTATATAAG GATTGGCATC ATANTGGGAT GANGAAGGTT
ACTGTCCTTG TCATTTGGGC AAAAGACCCC TACCCATATC TCAATGACCA
ATTCCTCANA AGTGTCCTCT TGGAANAANC TGANTTTTCC CCTCCGTAAN
TCCTTAACAC TCTCCGGCTT CCCCTCAATC CCAGGCCTTC CCCCTATTGA
CCTCAAGACC TCCTTTAATC CCAAAGAGNC GTTGACTTCC NCCAAATGCG
GGTGCTTTCC TTTCCAATCA AGAATATGTT TAAAAACCCT CCCAGGGAGT
ATGGACTATG TCCCCAATTT TAAAAGATGG AGGAACAAAG GCCCATTGGT
ATGCTAAAAA CCATGGGAAC AGGATCCAGA TTTCCCCCCA TCAATTCGAN
CTGCCAGTCT GTCCTCGGAG ATCCTTTGAC TTCTTGGAAT ANCCTTTTTG
TGTTGTGGTT TGGGGGTTGA TCTTGGAGAA CTTTTTTGTG TGTCTTTTAA
AAAATGTTTC ATTNGTTAAC TTTCCAAGTG ATGCTCTGAT TGGAGCAATC
TCAACACCAA GAAGAGTAGG GAAAGAAGCA GCGGNGGTCC TGGGTCCCCG
GG

MINT6 (SEQ ID NO: 6)

CCCGGGCCCT CAGAGGCCGC ACCACCTATT GTGTTCCAGG CTCGCAGGAA GCCAGACCTT
GCGAGAGGTG TGTGGGGGCG GAGAGTGGCA CAGGTTTGAC ACTGCAGGTC GGAGGAGGAA
GACAGTGGCT GCAAAGGCAA AATCGGGTGT TATTTTCCCA AGAGTCCCTT CAGCGTGAGT
GCCGGGGTCA GCTCGAACTG GAGCCTGTAA TTTGTGAGTG CGAGTGGGGA GCAGCAGGAG
GATCCTTTTC ATAGGTGAGG TCCCAGGAAC GAGCCTGGTC NGTGCTTAGG CAAAGGCCCT
TCCCACTTGT CAGCCTTGTT GTTTACCCAT CCCCTGCTTC TCCCAGACTT GCATTAATTC
CGNNGTCGGT TCCATCAACC CCGGNATCCC CTCCCCCGGG

MINT7 (SEQ ID NO: 7)

CCCGGGACCC GCCCAGAACG CTTTCGTGGG GTTGGAGAGG GCAGGACACA GCCTCTCTGG
GCCCGGTAGT ATGCGAAGAC ACGCATAACG CAAAAGGATT CCCGTCCTGG ACTTTGGGAA
TTAAGGCCCT GAACACATTG AGCAAAAAGT AGATCTGTCT GTACAGACGT TTCTTTCCAC
GTCTTTCATA GTAAGGACTT TATTAAAAAG CAGGCACTCG AATCCTAGGT GGGTAGATGG
GAGGCTGGGG CTGGGGATGG GGACGTCCTC TGTTTTCTGG TTGTGCACAT TAAAAATAAC
TCTCTGCTGC CTACTCCTAA ACGCAGCCGG CAAAAATGAG ACGTCAACTA AGCGCCGTTT
CAGTCCCAGA GCCAGGTCGT CCATGGGGGT TTTCAAGCGT TTTCTCGATG ACTGATTTTT
CTAGAAGCGA ATGGATTTTA TTCTCCCAGG TGTAAAGGCT ACCTCCTGCT TCACACCCGG
G

MINT8 (SEQ ID NO: 8)

CCCGGGCTGT GGCGGTGCAA CTCCCGCCGC CTGCGTTCTA GACAGAAAAG CCCCTTCTGA
CCATGCATTA GTAGTACTAC TCCATTATTC CTGTTAGAAC AAGTTAAAAG TAAGGGTTGA
GCCCAAAGGC CCCAGGAGGG GGGTCATGTG CGCCCCAGTC ACTCAGGCTC CCCTCGCTTC
TCCGGTTCGA GTTATGCCAT CAAGCTAATA TATTGTGACT GCTCTTCTCT CCTGTGACAA
AAGGCTTGCA GCTGCCTCCA AATCAATAGA TTGTCAAAGA AATATTGAAA ACAATCATGA
CACAAAAACT TAATCCTGGN TTGGAGGCTA CATAATCAGA AATTGTGCTA CTTGTTCTTC
CAAGNCACCC GATTTAATTT ATCCCCAAAC TTTTAGGCAA ATTTTTATTT CCCGGGACCT
TTTTCCCAGC AGATCCTGCT ACGTCTGTCG GGTTTGTAAT GTAATTTGTA ATTNCTNCCC
TTCATGTGGT CCGGTGCCTT GAACCATCTT TAATTAAAAG CATAATTAAG GGAAGATCTA
AAGAAAGACA ATTACCAGAT GGTCTTTTTT TTAGAGGCGG TAGTTGCGCA GAGAGGGGCT
CGGGGTCTGT CCCCGGG
FIGURE 6C

MINT9 (SEQ ID NO: 9)

CCCGGGTCAT TGTCCATCTC CGACCAGGGG AGTAGCCACC CCCACTAGCC AGCCGTCTTT 
ATCTCCTAGA AGGGGAGGTT ACCTCTTCAA ATGAGGAGGC CCCCCAGTCC TGTTCCTCCA 
CCAGCCCCAC TACGGAATGG GAGCGCATTT TAGGGTGGTT ACTCTGAAAC AAGGAGGGCC 
TAGGAATCTA AGAGTGTGAA GAGTAGAGAG GAAGTACCTC TACCCACCAG CGCACCCAGT 
CCCTCTCTCA GCAGTGGATG GGGATGGGGT GGGGGTAGGG ACGAGAAGGC AGCTGGTGGA 
GAAACAGCTT CAAGACTCTT TGGGTTCCTC CTGCTCTCCA GGGGAGCTTA CCTGGGGCTA 
ACTTTAGACA CACAGGTTTG GAGGGAGGAA AGAAGGAAGA ATTCTTTTAC AACGAATCAA 
TTAAGAGCAC TTCCTCTTTT CTTAACTTGG GGGGAGGGCC AGGAAAACTT CTTGAGTCAA 
GAAATCTTGG GAGGNAGACA TCAGATGTNG GCAAGAGGCA GACAGATTTT GGGAAGGCAG 
GCTTTGGGTC AANAAAGATC AAGCCTGTGT TGTCCCCAGG CTCTTCCCTG TTCCCCCATC 

CCGGG 

MINT10 (SEQ ID NO: 10)

CCCGGGGGCT GCAGAATCAG GAGTCTTCNC CTAGGTTTGG CCTTGGGCTC CATCCTACNC 
CCTGAATGTG ACNGCTGCGT CTTCGTCCCA CACCTACTTT TGAATGCCAA GAGGGGGCTC 
CGCTGGCCAG GACNGAATAT TTTTATGGTA AAAAATGACC GGCAGTTGCA TCAGCTCCAG 
GAGGGTGGGA GCCGTCACCC GAGGTCGCAC AGGCAGACTG ATGAAAATTC TGCTTATAAA 
GTCACTGCTC CCCCATTAAT TAGGGGGGAG GGGGCGCTCC GGAGCCACCA CGCACCTCGC 
CCACGGNCAA AAAGCTTGTC AACATTTTCC ACGAAGGATT GAAAATGTAA ATTAACTTTC 
AGATTATTCA ATGTCACCAA GGTATGGAAA AAGGTCGCCA TACTGGGTGT CATTTATCTC 
GTTGTGGATT TAAAGAGCTT TTTCTATTAA ATTTCTTAAA ATTAATGTTT TATGTTGCTC 
AGAGTAATTT AAACAATTAT GGGCTTAAAG AATTGATCAT TACAGCCCCT GGGATTTAGC 
GCTGCAGGCT GATNNCCCTG AAAAACCTCT GATTTATCAG GGNTCGTATT NGGCCGGGCA 
AGCCCGGG 

MINT11 (SEQ ID NO: 11)

CCCGGGAGTG GCTAACCAGG AANANNAGGC ACTGNCCACA CACCANGGGC TGGGAAATCA 
AGTGGCCTGC ACCAAGGCGG CTTCGGGGGA CTTGTCTGTG GCAAGTCTTG GTAGTCCCCA 
TTCAAACTTT TGCCTCC3AGC GTGTTAAGAA CAACAACAAA AAAAAAATCA AAGTGCCAAA 
GGTCCCTCTC TTCTCTCCAG CTCAAGAACC CACCACTTTT CTATGATTTC TTTACAATTT 
ATTCCCTCCC TTCCCCCAAT TCCGTTAGTC ACTTTACCCC CACCCCACCC TGGGTTTCTT 
TTGTCTGAAT CTTTTTCAAC ACCAAGGTCC CTCTGTATGC CTCTCCCCAA AAGCCCTTAT 
GAAAAGTTAC CTGCATTTTT TAAGTGCCTA CATTTCTTAA CTTCGCCTAA CAGCTCTTTG
CCTTAATTAA AGCCTTCTAC CAATTGCTTC TTTTTTCTAA GCTCGCGGGT TTTTTTCAAT 
AAGTTTTTTG TTTTTGTTTT TTAAGGGGGG AACAAAAGAA ACGTGATTAC CTTGGAAGGC 
GGCTTATTGC AGTTTGGGGG GAAAATTCAC TGCAGCGCTG CGCGACTGGG TTCGGCGTTG 
CCCAGGCGGG TCACATAGGA AGCGTGGTGG CCCGGG 

MINT12 (SEQ ID NO: 12)

CCCGGGTCCC AGCCCTGAGG ACCAGGTTTC AGGGCTCAGA AGACTCCAGC 
GAGGTTCCCT CGCAGATTGT GTCTGCGGTC GTTGGGGGAG GGGCCCCGCA 
GCCTTCAGCA GATTATCCAA AGGTCAGTGA CCCAGATATG GTTTTGGNCA 
CGGGCCATGT TTCACTTCCT GTGCCCCAAG CAGAATTTAG CTGAATAATT 
CGGACCCCAA ACCAAACAAA ACGCTCTTAT TTCCGTTTGG GGATTCTTCG 
GAGTTGGGAT TTTTCTGTCT CAAATTAGAA TAATNTGCAT TATTAACCAA 
CTTAACATTT ATTTGTTGGT TGGCTGCCTG ACCTTTCTGA GCCTCAGTTT 
NTTCGTCTGT AAATTGGGAG CTTACCCAGG TCGGAGGACT GTTGGAATTG 
GAAATATTCG AATAAGGAAG TGTTTTTGCA AGTGCTTTGT AAGCAGCAAA 
GCGCTTCTTC AGGGGTCAAT TTTTTTTAGC TCTGCAGTCA CCACCCAAAT 
TCGGAAGATC GTTGTGCCTT TCTTGGATGA GAATGCCCGG CTCCAGCCCG 
GG 
MINT13 (SEQ ID NO: 13)

CCCGGGTGTG TGTTGTTCCC CTCCATGTAC CACGTGTTTG TCCTGATGGT CTCCTACCCC 
CTGTCCCCCT GAGAGGCCCT GGTGTGTGTT GTTCCCCTCC ATGTATCCAC GTGTTTGTCC 
TGATGGTCTC CTACCCCCCG TCCCCCTGAG AGGCCCTGGT GTGTGTTGTT CCCCTCCATG 
TACCCACGTG TTTGTCCTGA TGCTCTCCTA CCCCCTGTCC CCCTGAGAGG CCCTGGTGTG 
TGTTGTTCCC CTCCATGTAC CCACGTGTTT GTCCTGATGC TCTCCTACCC CCTGTCCCCC 
TGANANGCCC GGG 

MINT14 (SEQ ID NO: 14) 

CCCGGGAGTG GCCCTGCCTG GCCATTTGCT CAAGCAGCAT GCAAGCCGGA TCTACAGCCG 
GTGCACCTTG TTCCTTGTTC TTCCGGGCAC CGGAGGCCCA CGTGAAACCT CTCAGAGGAG 
AGCGAAGAAG GCCACCTTTC ATCTCAATGA GCCAACAGCC TCACATCCTT AAGTCCTGCC 
TAATCTTACG AGTCATGAGT CATCGCTTCT TCCCCGCCAA GTCATTCAGG ATTCAGTGAC 
TCCAGCTGCC CTCAGGATCT GACGQAATCT TGAAGGCAGA GTTCCGAGAC TGCAGTCCAG 
GATAGAGAAG CCCCCGGCTC CATCAGGGGC TCCTCGGCTT CAAGGCAGGA CCCACCCCAA 
AGCTNTCAAA GCGGCAGAGG CCTGTTTTCA GGTCTATTTT TAJ^GATNTC TAGGGGAAGT 
GGTTTCTGAA TGCTCTGTAG GATGAGGCTG TGAACAGTGA TTGTTTCATT TGCTCGGGGC 
TGGCAAAAAA GGGATGTACA ACCCGTTCTG ACCACACCAG TGAGTTAAAA AGCACTGCAG 
ACTATGAATA CCTTTTGCCA GTTGGACTGG TTATGGGGAC CGATGGCTTC CCCTACAACT 
GGGGAGGCTG CCAGCCCGGG 



MINT15 (SEQ ID NO: 15) 

CCCGGGACCT CTGCGGTTAC CTGGGCCTTC CCTGCCAGCC CCCACCCCCT GCCCCCACCA 
GAAAGCGTTT ACAGGACAGG TGAGGCCCGC AGGAGGAAAA GCACTCCCCT GGCGCAACAT 
GACTCCAGCG CATCTGCGTC TAAGCCACAC CGTGCTCCTG GTAGATTAAA AATTAATTCT 
AAAAAAAAAA TCTCTCCTAT CCCAAATGCA CTGTTTTCTG CCTTGCTTGA CAATTGATTT 
GTTTTTAAAG GAAAGTTATG GGTAGATCCT CTTTTTTCTT TCCCATTCTT TNNTTCTTCT 
TTTATACTGG AGGGAGGGAA ACGGAGGCGA GGACACACAC GCGCAGGCAG GGGNTGAAAA 
GGCCGAGGTG GGTTTTCCTG TTTAATATCA AAGGAGGGCG AATAATGGGT TTCCTCGGTC 
CGGCTAGGCC GGCCTTTGAC TCAATTGGAA ATGCAAAGGC AGCTTTTGCC TATTNTCTGG 
CTGCTGGCTG AGACCCTAAA TTTCCGTAGG AAATCGTCGG ACACGCACTT AATCGGNCTT 
TGCAANCTTT CCCTCGAAGT TGCACGCGGG TCTGGGCGGA GGAGGCGAGG TAACCCTGGA 
TTCGAACCAG CGCCTTTCTC TCCTTCAGGC CTCCGCCCGG G 

MINT16 (SEQ ID NO: 16) 

CCCGGGACAA GGCGGGTCAC CTCTGGGGCC TCACCGCAGT TCCACTTCCT TTCTCGGGTA 
TTTGGAAACC GTCACCCCGC CATTTCGGTG TGGGAAGAGC GCGCGGGCCC TGCCGGACTT 
TAGTGCTTTA GGGGTTAATT TCGGGCTGAC AGGGACGGAG CCTAAGGCAG TGAGCGCCCC 
AGTACCCTCA AACCTTATTG CTGGCCCCTG CTGTCTGAGC TTACAAGCAT TACCGCCGCT 
ATTTCCGTGC GGGCTGACAC GGGAGATGAA AGTGGTGAAG ACACCCAGGG TGCGGGGGTG 
GAGGTGGGGA GAGGAGCCAG ATGGGATTGA TCCCCAGAGC CAGATGGGAT TTAAAGGTGA 
GGGAGGAGGG CATCCTGATG GCGTGTGGTC AGTTGATGCC AGATTGGATG GCTGAGACAC 
CTCTGCAGCT TACAGGAAAG ACAGAGGGAA AGGGTTCTAT GAATTCTAGC TGTTCATACT 
CAAAGCAAAT AATTAATCAA GTGGGGGGGG GGCCTCTAGC TGTAAACCCA TACCTCTAGG 
AAACCTTTTG TCATGTGGAG CCACAGTGCT CACTTGACAG ATTCCCCACT GAGAAGTGGG 
CTAAGAGGTT GGCCTGCATT GCTGGGTGCC TCCAGGTGGG GAGTCCTGTA CCTGGGAGCC 
CGGG 
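
Because the listed clones are presented as CpG island fragments, it can be useful to compute the two quantities most often used to describe a CpG island: GC content and the ratio of observed to expected CpG dinucleotides. The sketch below is illustrative only. The thresholds in the final function (length of at least 200 bases, GC content above 50%, ratio above 0.6) are the commonly cited Gardiner-Garden and Frommer criteria, used here as an assumption and not drawn from the claims, and all function names are hypothetical.

```python
def cpg_stats(text):
    """Return (gc_fraction, observed/expected CpG ratio) for a sequence.

    Whitespace is ignored. Undetermined bases (N) count toward the length
    but not toward C, G, or CpG, so they slightly lower both values.
    """
    seq = "".join(text.split()).upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C * G) / N.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(text, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed thresholds described in the lead-in."""
    seq = "".join(text.split())
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_ratio


print(cpg_stats("CGCGCGCG"))
print(looks_like_cpg_island("CG" * 100))
```

Short clones, or clones with many undetermined bases, may fall below these thresholds even when they derive from a genuine CpG island, so the result is best read as a rough screen.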
MINT17 (SEQ ID NO: 17)

CCCGGGCGCC GGGGGTCCCA TCCACTTCTC CCTACCTTCT CTCTCTCCTG TTGGAGAGGC
CAGGGGCTCC CGCGACTGAA AGGGGCCAGG CTGAGGCTGC CCAATCCCAG GGACAGGAAG
AACATTTTGG ATGGATCGCG GGGTGCAGAA AAAGGAAGTT GAGCGACAGA CGCNAAGCCC
AAAGCGTGGA ATTTGGGAAG AGGCAGAAAA AAAGTGGGCG AAGAGGAAGA GAAAAAACNA
GCNACGAGAT GGGGAGGGAC AAGAAGTTTG GGGAGGATAG AGAAAAAGAG AAACTGCCTG
GAAAACAAGT TGCAAGACAA TTGAAAAGTT GAATGAGAAG GAAAGANAGG AAGGTNGTNA
AATCAAATTA GAACTGTTGG AAAAGANGGC TGGGACACGG CTTTCTCTGG TTCTGTCCTC
CCAAGAAATT GGAAACCTCT CCCCTTCCCG GCACCAANCT TNCGGGATGT TCCGGTGCCC
CTCCCCCGGG

MINT18 (SEQ ID NO: 18)

CCCGGGAGTG CTCCTGCCGG CAGCATGTCC ACTTGCTAGG GGCAGAGGGG CAGTGGGAGT
GTCCCCCATG TGCCAGCCTG TCCCCACACT TGGGTTAACC TTCAGTCACC AGGAGGAATA
AGGTGTGACA CCTCTGAGAC TGTGGGCACA TGTGCTGCTG CCTGACAGTC ACAGGCTGTA
CTGTGCCAGT GACAGAGGCC ATGATGCCCA CATAACCCAA ACCCGAACCC GAACGCTAAC
CCCTAACCCC TAACCCTAAC CCTACCCTAA CCCTAACCCA GCCCTAACTC TAGCCCTAGC
CCTAGCCCTA GCCCTAAGCC CTAAGCCCTT AAGCCTAACC CCAAACCCCC AACCCCAACC
CCNAACCCTA ACCCTTAACC CTTCCCTCAA GCCTCTCNAA CCCTGCTTGG GTTTACAAGG
TTATTNAACC CCGGG

MINT19 (SEQ ID NO: 19)

CCCGGGATTG GTCTTTTGGC TGGGATGTAA AGGAGGAGGA ACTAGCTGGG GAAAGGCTGG
GTAAGGGGGA AAACCCAGGG AATTTAACCC CCTCTTCTGT AAAATGAGAA CCCATTCATT
CACTTAGCAA ATCTTTATGG AGCTCTGTAG GGGCCCTGGG GGAAGGCGAC CAGAGTGTCT
GAAAGCAAAG AGCAGGAGAG TGTGGACTCA GCTGAGAGGA GAGCAGAGGC TGAGTGTCAG
GGTGGGGAGT GGGAAGGGGT TTCCAGGTGC CAACAAGGAC ACAAGCCAAG GTGTTGCAGT
AGGGCATTTG GGGAACGTAT GAGAAGCGCT CCTGCCAGCT TCCCTGGCGG TGTCCTATTC
CCTGCATCCA TTTTGGACCA TGAGCCCCTT CTCTTACCCT CTGGCCAGGA CCGAATGCCA
TAGACTTCCC CAGATTGCCC GGG

MINT20 (SEQ ID NO: 20)

CCCGGGCGCC AGGGCGGCCC GAACCCCAGC CAAGCCGGCC AGCAGCAGGG CCAACAGAAG
CAGAAGCGCC ACCGGACGCG CTTCACCCCC GCACAGCTCA ACGAGTTGGA GAGGAGCTTC
GCCAAGACTC ACTACCCCGA CATCTTTATG CGTGAGGAGC TGGCACTGCG TATCGGGCTG
ACCGAGTCCC GAGTGCAGGT ACGAGGGGCT TGGGATCTGG GACAGAAGGC AAGGACAGGG
CGGGAGGATT TGGGCAAAGG GAGCAGGGTC TTCCCTTCCC CTGTCGAGAT CCTGGGCTGC
TTTCAGGCTG CCTGTGCGTT CCTGTATCGA GTTATCTCCA TCTCTACCCG GAAACTGGTC
CCCATCGCCA TCCCCCAATG GACACGCAAG GCCCGTCTCC GGCCAGTATA GCGACATCCC
GGAAGAAGCT CCTCAAAATC GAAGCCCGGC GTTGTCGGGC TACAGGGCTC GCCTCCTCCG
CCTGAGAAGG CAACCTCAGC GCCCCCCGGG

MINT21 (SEQ ID NO: 21)

CCCGGGAACT ACCTAACGCT AGTTCAGTCC CAAAATGCTG CCCAACGACA GAATGCTCGC
CTCCTTGCTT CCTCTAACAC TCTGGCACAC CCACTTGGTG TCGGGCCTCT ATGGGCTCGC
AGTG7AGCCC TGAGCCTGGG CTGCCCCTTC CCATGTGCCC CCTGCCAGCC GGCCCTCCCT
CCCTTTGGGT GCCCCATCCC TCCAGTCAAC TCCTAGCCGA CCCTTAAGAG TCAGGTATTT
GTAGCCTTCC CTGACATCCC TCCCAGGCTG TCCCACTGCC AGGAGGAGGA GCCTGCCCCT
CCTCCACCCT CCTTACAGCT ATACCTAGCC TTGGCCATAA TCACTAATGG ACCAGGAAAC
ACCCTGGCGC GCAGAGCCAC CGCAAAGTGG CCCGCTCAGG CCCGCCCGGG
MINT22 (SEQ ID NO: 22)

CCCGGGGATC TCAGAAAGGG AAGGATGTGG GGAGAGTGAA GGTGGAGGCA GTCACACCTA
TCCTGGTGAC CTTGGCGTGC CCCCCTGGAG TGACGAGCCA GGGGCTTATA TAGGTGCGAC
TAGGATGTTG CCTCGTTTTT CACTCGGGGC TGCGAAGAGG GCATTGCTCT TTCTGATGGC
TGGAAGACAG GGGCAAGGAG AGAGAGAACC GGCCCCGAGA CGGGCTGGAG GGTGGGGACA
CTGGGGAGTT TGGAGCTGGG GGTTCGGAGT GGGAGGTTTG GGTCTTCTGA GACGCTCCAG
ACTCTCCGGA GGCGGCAGAG GTCGAGGCAG GAGGCGAATG TGACGCTTAG GGTCGCTACG
GTTGATGTTG GGCGCCTTTG GAAGCTGGTC ATTAATTCTT GTCATCGGGA GGTTTCGCGG
ANGGCGACAG CGCCCGGG

MINT23 (SEQ ID NO: 23)

CCCGGGCGCC CGGCCCTGGC TCGCGGAATG GGCGGCCAGA TCTCAGGCCC TGCGTGCCCG
AGCTCAGTCC CAGTTCCAAC CGGGGGTGCC CATGGACTCT CGGAGGGCAC TCCTGGGGGG
ACAGCTAAGA CACCAGGCTG CAGGATCACT CATTGCACGC TGCATAATCG CCGCCACAAA
CTCTCCCGTG CGCAAGAACA AACGCGCGTG GGACAGAAAA AGTTCCTAGG TCTCCGCAGG
AGTGAATGCA AAATCCAGGG GACTCAGGGT CATGTTGGGA GCCCCTTCTC CCCCCGAGAG
TCAGGGAGCT GTTGAGGTGG GATCGGTGAG GGTCGCGCCA CGCGGGTCCC TTCCCTACCA
GGCTCGGATA CCATGCAGCG TGGACACTCC CGAGTTGCTC TGCGGAATCC CGGG

MINT24 (SEQ ID NO: 24)

CCCGGGGACG GGGAGGGAGG AGGGCTGCCG GGATGTGAAC CGGGGAAGGC AGCTGGGGCT
GGAGAGCAGC GCGGAAAGGG GGCCCAGGGA GCTGGAAAGC GAGCCAAGAG GAGGGCAAGG
AAGGTGGCGG GCTACGGGGA GGGGAAAGAA AAAGGGTGTC TTGGCGGTGG CCTTGGTAAG
AGAAAGGGGC AAGGGGTATA ATTGACAAGG CACTGAAAGT ATTGAAGTCA GAGCCTTGGG
AAGGATCTAC CGAACTCTCG GCGGTCCACG CGGGGACAGA CCTCAGCCCG TGAGCCTTGA
GCTCCACGCG GGGACAGACC TCAGCCCGTG AGCCTTGAGC TCCACGCGGG GACAGACCTC
AGCCCGTGAG CCTTGAGCTC CACGCGGGGA CAGACCTCAG CCCGTGAGCC TTGANCCCAG
AAGGAGTGGC AACCTCANGA CGTTTGCCAA GTGGCCTGGA ATGTTANGGA AACCCCAGCC
CCGCCAGGAA CANANCTGGC ACTAATTCCC NGCTCGGNCC GGG

MINT25 (SEQ ID NO: 25)

CCCGGGGTGG GAGCTGGCTC GTGGGGGCGT GCGCTGCGCG AAAGCGAAAG CCGCCCGCCA
GAGCAACTTT GCGGCGGAAG GCGCCGACGA GGAGCTGTGC CGTGCCGCTC TTGGGGATGG
TGAGCTGGCC GCCCGGCCGG GTGGGCAGCG CGTCCGGGCG CGGTGCTTCG CTAGCTATAA
ATAGGTGCTG TGCGGGGACA GGAAGATGGT TCCGGCCCTT TACAAGCACC GGCCCGTTAT
GTGCGCTGGG CTAGGACCTT GCCCCGCAGC GGAGTGGGAG GAGTGAGGTT AGGGGTAACG
GTTGCATGGG ATGGGGGGTG GGCACATAGA GCCTACAGCA GAGTTGGCGG CGGGGCTCTC
CCATGCACTT GGTTGTTTGT CGTTTCTGCT TTTCCCGGG

MINT26 (SEQ ID NO: 26)

CCCGGGCTCC GGCATAGCTC TAGATTAACG AGCTGGGCGA CGGGGGCGGG GGCAGCATGC
CCAGCGTCGG TGCACGGCCG GGGTTCTTAG ACATCACAAA CTGTGGAGCG ATACATTGGA
AGCGAAATCC AAGAACGACA CCGGCCGGCG TGTTTCTGAT GTAGTCGTGA TTAGTGTTGG
AGATGGCCAA GGGCGGCTTG CGGAGCCCAG GGAACGCAGC GAGCCAGGCC CGCGCCCCCC
TGCAAAGCTC TGCCTTGAAC CACGTAATTC AGGCACCCAG GGTGTCCCTC CCTAGGTCCT
GGCCACATTT CCCGAAGGTT ATTAAATGGA GAATTCAGCA GTGGAGTTAG AGACGGACGA
ATGTACAGGA AGATAAGAGG GAAACTCTTC CTCATTTGCT TTAGGGGGTT GTGCTGGGAA
AATGCGGGGA GGTTAAACAA AGCTTCTCCT ANGACTCCAC GATTCATTTC TAACAACTTC
TTAAAAGACT CCCGTCTCNG GAGCAGACGC NCCCTCCCCG CTCTCTAAGC CCCGCTGCAT
GAAANGATGC CCTCGGCCCC TTGCCAGAGG GCCGGGTCCG GGATTCCCGG G
MINT27 (SEQ ID NO: 27)

CCCGGGATCC GGGAAGGCTC CCCCGAGCCG GGGTCGGAAC TGCGGCTGGA GGGGCCTCGG 
CTTGAGGAGG ATCTGGGAGG GCGGGGGCTC AGTCCTGGCC ACCAGGTGTG AGGGGTCGGG 
TGCGGAGCCC TGTGTCAGAC GCGGCGGTGA AGGCTGTAGC CCTGCTCTCC GGGATGGGGG 
TGGTACTCTC ACCGCACCCT GCCCCAAGCC GCAGGGAGCC CCTGGCGCCC CTGGCGCCCG 
GG 

MINT28 (SEQ ID NO: 28) 

CCCGGGCCCC TGGCGGGGAC ACCGGGGGGG CGCCGGGGGC CTCCCACTTA TTCTACACCT 
CTCATGTCTC TTCACCGTGC CAGACTAGAG TCAAGCTCAA CAGGGTCTTC TTTCCCCGCT 
GATTCCGCCA AGCCCGTTCC CTTGGCTGTG GTTTCGCTGG ATAGTAGGTA GGGACAGTGG 
GAATCTCGTT CATCCATTCA TGCGCGTCAC TAATTAGATG ACGAGGCATT TGGCTACCTT 
AAGAGAGTCA TAGTTACTCC CGCCGTTTAC CCGCGCTTCA TTGAATTTCT TCACTTTGAC 
ATTCAGAGCA CTGGGCAGAA ATCACATCGC GTCAACACCC GCCGCGGGCC TTCGCGATGC 
TTTGTTTTAA TTAAACAGTC GGATTCCCCT GGTCCGCACC AGTTCTAAGT CGGCTGCTAG 
GCGCCGGCCG ANGCGAAGCG CCGCGCGGAA CCGCGGGCCC GGG 



MINT29 (SEQ ID NO: 29) 

CCCGGGAGCT GACCGTGGGG AGGCCGGTTC CGCTGGTTTC AACCAGCCCA CTCTCCCCTC
TTGGGATGCC CAACCCCGCG TACTCACCAT TTCCTGGTTT CCAGGATGTC CTGGCATCTT 
AGCTATGCAT TTCCAGTACC TCCAGACCTC AGGGCAACAA AGGATGTGAC AAAGTCACCT 
AGTGCTCCTG AGGGGCAACA CGCGGGACAG TAGAGATGGA TCTCAGGCTC CGGCCTGAGC 
CAGAGACAAA GGCCGCGCCA AACGCTGGAA GCCACGCCCT CCTCCCCAAC TGCGTGCCTG 
ATAGGACGGT TCCTACTCTG ACAGATTGAA TAAGGCTCCA GGACCCTCGC CCACACCCAC 
CGTCCCCAGC ATTAGTGCGC TTTATGGACA GGGAAACGGG ATCCTGTANG CGGGGTCACA 
CGCCCCGGG 

MINT30 (SEQ ID NO: 30)

CCCGGGCACC TGGGCTGGGG GGGCACTCAC ATGGCTACCG GAGGCCCCCA CGTGCGGCGC
CCCGCGGAGA CAGGGGTTCG CGTTCAGAGC TGGTGGCGGA TGGACCAGGT GGCCGCGGGG
ACCAGCTGGG TCCAGATGTG CTGGGCCTGC TGGAAGGGGA CAGGTGCTAC CTGGACGTGT
AATGGCCCTT GGTCTCTTTT GGCCGAACCT GCCGCTCCGA TCCCCCTCCA TCCCTTCATC
CCTCCATCCC TCCATCCCTC CATCCCTCCG GCCCTTCTCC CCTTCTTCCT CCGCTGCCTG
TGTTGCAGAG AGGGGCTGTC AGAGACTGTT GATGTGGGAA AAAATGAAAT GGGGGAGGGG
TTGTGATTGG CAAAGGCCAG TTGTGCCGGG AGCGGTGGGT AGAGGGGGTG CCCTGAGAGT
GGGAAGCCCT AAACTTGGAG GGCAGCGCTG ATGGGGAGAG GGTTCCTGGC ACCCCCACCT
GCCTTGGAAG TGGGAAATGA CATAGCGGGA GGGGGGCTGC AGTTCCAGCC CCCGGG

MINT31 (SEQ ID NO: 31)

CCCGGGGCCT CTATCCTGGC GGGAAGGGCA GGCCGACCCG GCAGACTGCG GCCTCTCGGG
AGGGAAGAAG GTGTCAGACG CGCGGAGCAA CCATAAATAG CCCCCCTTTC CCAGAAGACG
GCACGGGGTT CAAGACTCAG GCGCCGCATA CTCAGAATGA GAGCAGAGAC TCCCGCCAGG
AAAAAAGGGC ACTTAGGGGA TCTGCTCATT AACATGAAAT GCAAATGAGC CCGCCCGGCC
TCATTTACAC AACTCTGTGC ATGGATTCGG CGAAAGGGCA ACCAGGGAGA CGACGGCGCA
GCAGCCACTC TGCCACTTCC CCCATCCCCT CCCCCCCATC GGCCGGGGCG GGAACTGAGA
CGACCCCAAC CCTCTGCGGC GGCGGGAGGT GCGCGGGGGC TGCGTGGGTG GTGCAGCCTT
AGGGGAGTGA ACAACGCCCA GGGGTGATGG CCTCAGCAAA GTGAGGGGTG GTGATGGAGG
TCATCCGACC CATCCCGCCG CCTCTCCGCA GTGGCGCAAG CGCCCCAAAA TCTCCGGAGA
NGGAACTGAG TGACCCACTA GGTTCCGCCG TGTCTACCTC TCGCAGATGT TGGGGAAGTG
CTTCCCGGCG TCTAATCCTC GCTGTTCCCC CCTCCACCGG CGCCCAGCAC ACCCGCGGCG
CTCCGCTCCC GGG
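
The listing uses N for undetermined bases, and a few positions in the figures carry characters that are not nucleotide codes at all (for example a Q or a digit), which most likely reflects transcription damage and not the underlying sequence. A minimal sketch for flagging such positions before any downstream use follows. The function name and the allowed alphabet are assumptions, and the sketch only reports suspect characters without attempting to correct them.

```python
def find_invalid_bases(text, allowed="ACGTN"):
    """Return (position, character) pairs for every non-whitespace character
    outside the allowed alphabet.

    Positions are 0-based and refer to the sequence with whitespace removed.
    Case is ignored, so a lowercase listing is treated like an uppercase one.
    """
    seq = "".join(text.split())
    return [(i, ch) for i, ch in enumerate(seq) if ch.upper() not in allowed]


# Hypothetical fragments containing damage of the kind seen in the figures.
print(find_invalid_bases("CCCGGG GTACCTGQAC"))
print(find_invalid_bases("AGA2^TGGNC"))
```

Reporting and not repairing is deliberate: a flagged character cannot be resolved to a specific base from the printed page alone.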
MINT32 (SEQ ID NO: 32)

CCCGGGTACC TGCACAGCTC GCTCCCTCCC ATCCTTCGGG TCTTCGCTCG AACGTCCGCT
CCTCGGTGAG GCCTTCCCTG GACAACGCAT TTGAAACGTA ACCCCAAGGC AAGAAGCCAC
CTTCCAGGCG CGCAGCCGAA GCCCAGTGCC AAGGAGGCCG GAGACTCGGG TGCCCGCGCA
TCCCGAAAAC AGCCTCTGAG GGGTCCTCTG AGCATCCTTC CAGCGTGTTT GGGAGGCAAA
CTCGTTGACT AGCTCTTGAG AGGAGTGGCT AGAGGAATCC AGGCGGGGAA GGGGACGGTG
GACTCCAGGA GAGTGTAATT TACAAAGGCG GGGGGCGGGG ACGCCCAGGT CCGAGTCCCA
GGACTCTGCG CCGGACGCTT CGCCCGCCCT TTCAGGTCCC CTGCCCGGTC CTCGTACCCG
CGCGGGTCCG GAGAACCTCT GAGCACCGGC CCCCAGCCCC CGGG

MINT33 (SEQ ID NO: 33)

CCCGGGCAGA AAGGCTCGGA TGGCGGTGGC AGAAAGGCTC GGAGGCGGTG GCCTCAGATC
CCTGGTCCCT CCGGGTCACT GTCGGCTAAT TCTGGGGGAA GGACTGGGCA AGGCTGTTTG
GAAGAAGTCA GCGCCCGGG
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